Re: Automatic Failover to different Data Center.

2018-05-07 Thread Sanel Zukan
Here is a nice explanation [1] of what your options are. HDFS does not support replication between clusters [2]. If you are using HBase, things are better. > In case hadoop cluster in DC1 goes down , Automatic failover occurs to > DC2. There are setups with DRBD, but if you can afford a small loss betwe

Re: Automatic Failover to different Data Center.

2018-05-07 Thread Wei-Chiu Chuang
Distcp is a backup tool, not a synchronization tool. At best, you get a point-in-time snapshot of DC1, for example from a periodic distcp schedule that runs every night at 12am. In case of total failure, you lose everything written after that point in time. On Mon, May 7, 2018 at 12:30 AM, akshay naidu wrote:

Re: using Kerberos with certificate for authenticating Hadoop components instead of login/password keytabs

2018-05-06 Thread Rajiv Chittajallu
Hi Dominique, I think you are referring to PKINIT. This is applicable for getting initial TGT. As for region servers (and other similar components in hadoop), the principal is used in two contexts, one as a service and other as a client. * A service to HBase Client To replace service principal wi

Re: using Kerberos with certificate for authenticating Hadoop components instead of login/password keytabs

2018-05-02 Thread Benoy Antony
Sorry Dominique for the late reply. For components like hadoop servers or hbase servers , currently it requires a keytab file to authenticate with KDC and obtain TGT. So AFAIK , the authentication between Hadoop/hbase server and KDC cannot use certificate. cheers. Benoy On Fri, Apr 6, 2018 at 6

Re: Read or save specific blocks of a file

2018-05-01 Thread Thodoris Zois
That’s what I did :) If you need further information I can post my solution. - Thodoris > On 30 Apr 2018, at 22:23, David Quiroga wrote: > > There might be a better way... but I wonder if it might be possible to access > the node where the block is stored and read it from the local file syste

Re: Premature EOF and RemoteException

2018-05-01 Thread Fadzly Zahari
atanode(s) From: David Quiroga Sent: Tuesday, May 1, 2018 3:35:47 AM To: Fadzly Zahari Cc: user@hadoop.apache.org Subject: Re: Premature EOF and RemoteException Which suggestion from stackoverflow was that? The warning states there aren't enough data

Re: Premature EOF and RemoteException

2018-04-30 Thread David Quiroga
Which suggestion from stackoverflow was that? The warning states there aren't enough data nodes "could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation." How many data nodes should there be on the target? I

Re: Read or save specific blocks of a file

2018-04-30 Thread David Quiroga
There might be a better way... but I wonder if it might be possible to access the node where the block is stored and read it from the local file system rather than from HDFS. On Mon, Apr 23, 2018 at 11:05 AM, Thodoris Zois wrote: > Hello list, > > I have a file on HDFS that is divided into 10 blo

Re: Yarn didn't read logs completely while app is running

2018-04-27 Thread Gour Saha
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is your friend. Set it to 3600 and you will see logs of running apps being aggregated every hour. -Gour From: Soheil Pourbafrani Date: Friday, April 27, 2018 at 12:56 AM To: "user@hadoop.apache.org" Subject: Yarn didn't read lo

Re: How to configurate nodemanagers resources dynamically

2018-04-27 Thread Gour Saha
What OS do you have? For 2.8.3 you can set yarn.nodemanager.resource.detect-hardware-capabilities to true and then either don’t set any values for yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores or explicitly set both to -1. As a result, they will be automatically c
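The three properties named in the reply above can be written out as a yarn-site.xml fragment. The following is only a sketch: the property names and the -1 values come from the message, while the helper name yarn_site_fragment is made up for illustration, and the output still has to be merged into the real yarn-site.xml by hand.

```python
import xml.etree.ElementTree as ET


def yarn_site_fragment(properties):
    """Render a dict of property names/values as a Hadoop-style <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Values as suggested in the message: enable detection and set both limits to -1.
AUTO_DETECT = {
    "yarn.nodemanager.resource.detect-hardware-capabilities": "true",
    "yarn.nodemanager.resource.memory-mb": "-1",
    "yarn.nodemanager.resource.cpu-vcores": "-1",
}

fragment = yarn_site_fragment(AUTO_DETECT)
print(fragment)
```

Leaving the two resource properties out entirely should be equivalent, according to the message; setting them to -1 just makes the intent explicit.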

Re: hadoop-hdfs-2.x.y.jar & Hadoop versions compatibility

2018-04-26 Thread Christophe Jolif
Thanks Sean. This clarifies things. -- Christophe On Thu, Apr 26, 2018 at 5:55 PM, Sean Busbey wrote: > You ought to rely on the hadoop-client dependency (it's considered > downstream facing) instead of directly on hadoop-hdfs (it's internal facing > and the contents might change). > > Client -

Re: hadoop-hdfs-2.x.y.jar & Hadoop versions compatibility

2018-04-26 Thread Sean Busbey
You ought to rely on the hadoop-client dependency (it's considered downstream facing) instead of directly on hadoop-hdfs (it's internal facing and the contents might change). Client - Server compatibility is covered under our "Wire Compatibility" section of the compatibility docs: http://hadoop.a

Re: distcp from plain java program

2018-04-26 Thread Hendrik Haddorp
Hi Gour, I did but, the problem seems to have been the local execution. The local execution uses only one thread, which is what I saw in the logs as well. So I ended up just doing my own copy using hadoop FileSystem APIs and using multiple threads. That worked pretty well and allowed me to co
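The approach described above (skip distcp's single-threaded local execution and copy files yourself from several threads) can be sketched as follows. This is not Hendrik's code, which was not posted here: it uses Python and the local file system as a stand-in for the Hadoop FileSystem API, and the function name copy_tree_parallel and the default of 8 workers are assumptions for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src_dir, dst_dir, workers=8):
    """Copy every file under src_dir to dst_dir using a pool of worker threads."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    files = [p for p in src_dir.rglob("*") if p.is_file()]

    def copy_one(src):
        # Recreate the relative directory layout under the destination.
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        return dst

    # pool.map re-raises the first copy error when its results are consumed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_one, files))
```

Threads fit here because the work is I/O bound; the same structure would apply with one FileSystem copy call per file in place of shutil.copyfile.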

Re: distcp from plain java program

2018-04-25 Thread Gour Saha
Hendrik, Did you try setting maxMaps to a higher number? The default is 20, so you might try setting it to a higher value. -Gour On 4/21/18, 7:01 AM, "Hendrik Haddorp" wrote: Hi, I'm trying to use distcp (org.apache.hadoop.tools.DistCp) out of a simple java program to copy

Re: Regarding Docker container run_time setup

2018-04-20 Thread Shane Kumpf
Hi Ahmad, This looks to be a classpath related issue. IIUC, Hadoop is installed at /home/ubuntu/hadoop-3.1.0 on the host and at /usr/local/hadoop within the image/container? I noticed "ADD rm-hadoop-config/* $HADOOP_HOME/etc/hadoop/" in the Dockerfile. Is rm-hadoop-config a copy of the configurati

Re: Regarding Docker container run_time setup

2018-04-19 Thread SeyyedAhmad Javadi
Hi Shane, Sorry, I realized that I missed your suggestion in my last test. I later updated to local/hadoop-ubuntu:latest and am now working on the following error. I confirmed that mapred-site.xml within the container image has the information this error is asking for. Do you think container is being l

Re: Regarding Docker container run_time setup

2018-04-19 Thread Shane Kumpf
Hello Ahmad, The image being used is not privileged/untrusted based on the settings in container-executor.cfg. In container-executor.cfg you have set docker.privileged-containers.registries=local, but the image name variable in the job is using "hadoop-ubuntu:latest". Based on that setting, YARN i

Re: Journal node edits directory

2018-04-19 Thread Francisco de Freitas
Hi Anu, thanks a lot for the tips. Much appreciated. I'll try to implement those changes. Regards, Francisco On Wed, 18 Apr 2018 at 18:56 Anu Engineer wrote: > I would start off by asking that Journal nodes be on separate machines, > maybe along with namenodes. > > If that is not possible, at

Re: Journal node edits directory

2018-04-18 Thread Anu Engineer
I would start off by asking that Journal nodes be on separate machines, maybe along with namenodes. If that is not possible, at least provide dedicated disks to the journalnode process that are not shared by your datanode process. >Is it expected to grow very large and/or needs to be in a separate p

Re: Which [open-souce] SQL engine atop Hadoop?

2018-04-15 Thread Samuel Marks
Hey cool, just found this one: http://trafodion.apache.org/ Samuel Marks http://linkedin.com/in/samuelmarks On Thu, Feb 5, 2015 at 8:39 PM, Azuryy Yu wrote: > please look at: > http://mail-archives.apache.org/mod_mbox/tajo-user/201502.mbox/browser > > > > On Tue, Jan 27, 2015 at 5:13 PM, Danie

Re: using Kerberos with certificate for authenticating Hadoop components instead of login/password keytabs

2018-04-06 Thread Dominique De Vito
Hi Antony, Thanks for your answer. > Though I have not used a certificate for authentication, I had used a 2FA based kerberos authentication. Instead of password , it was Pin and a token. Well, human-client authentication is one point, and thank you for confirming it runs with other authenticatio

Re: using Kerberos with certificate for authenticating Hadoop components instead of login/password keytabs

2018-04-05 Thread Benoy Antony
Hi Dominique, It should work. This is because the authentication mechanism (password or certificate) is between the client and KDC (kerberos server). Hadoop never knows about the password or certificate. The Hadoop servers receive a service ticket from the client. Client obtains service ticket f

Re: Error running 2.5.1 HDFS client on Java 10

2018-04-03 Thread Enrico Olivelli
Hi Akira, thank you for your reply. FYI I have upgraded to HDFS client 2.9.0 and all is running very well on JDK10. I am using only HDFS client, not the full stack Cheers Enrico 2018-04-02 17:15 GMT+02:00 Akira Ajisaka : > Hi Enrico, > > Now Java 10 is not supported in Apache Hadoop. > https:

Re: Error running 2.5.1 HDFS client on Java 10

2018-04-02 Thread Akira Ajisaka
Hi Enrico, Now Java 10 is not supported in Apache Hadoop. https://issues.apache.org/jira/browse/HADOOP-11423 Please use Java 8. Regards, Akira On 2018/03/23 22:22, Enrico Olivelli wrote: Hi, I am trying to move an application to Java 10 but I get this error. I can't find it in JIRA, has anyon

Re: How to compile and use my own MRAppMaster class

2018-04-01 Thread ruimeng...@aliyun.com
Dear Kwak: See "BUILDING.txt" in the root directory of the Hadoop source tree. It describes several ways to build the Hadoop source code. I suggest using the Docker-based build. In my experience, you don't need to change the Java classpath or Hadoop configuration parameters, just replace

Re: How to change java.io.tmpdir for yarn jobs in hadoop 3 cluster

2018-03-23 Thread Michael Shtelma
Hi Vinod, Thanks for suggestion! I have used exactly these parameters. My problem was caused by strange behavior of the resource manager. It did not respond to start/stop commands, so the problem was, that the configuration changes were not used and seen by Yarn. I have killed resource manager pr

Re: How to change java.io.tmpdir for yarn jobs in hadoop 3 cluster

2018-03-22 Thread Vinod Kumar Vavilapalli
What are the directories you are seeing in /tmp/? You may be looking for the properties yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs. HTH +Vinod > On Mar 16, 2018, at 3:42 AM, Michael Shtelma wrote: > > Hi everybody, > > How can I change java.io.tmpdir folder for my YARN jobs

Re: Yarn mapreduce Logging : syslog vs stderr log files

2018-03-20 Thread Sultan Alamro
LOG.info("text") -> syslog > On Mar 20, 2018, at 9:02 PM, chandan prakash > wrote: > > Hi All, > Currently my yarn MR job is writing logs to syslog and stderr. > I want to know : > how it is decided which log will go to syslog and which will go to stderr ? > Can I redirect logs instead of go

Re: [EXTERNAL] Yarn : Limiting users to list only his applications

2018-03-20 Thread Benoy Antony
Yes, Thank you Harsh. That's what I was looking for. On Mon, Mar 19, 2018 at 8:49 PM, Harsh J wrote: > You are likely looking for the feature provided by YARN-7157. This will > work if you have YARN ACLs enabled. > > On Tue, Mar 20, 2018 at 3:37 AM Benoy Antony wrote: > >> Thanks Christopher. >

Re: Unable to download hadoop client 3.0 dependencies from mvn central

2018-03-19 Thread Binita Bharati
I got this resolved. There was an incorrect repository setting in my ${maven.home}/conf/settings.xml. After removing the incorrect entry, mvn dependency update worked fine. -Binita On 19 March 2018 at 22:13, Binita Bharati wrote: > Hi, > > I have the following dependency added to my pom: >

Re: [EXTERNAL] Yarn : Limiting users to list only his applications

2018-03-19 Thread Harsh J
You are likely looking for the feature provided by YARN-7157. This will work if you have YARN ACLs enabled. On Tue, Mar 20, 2018 at 3:37 AM Benoy Antony wrote: > Thanks Christopher. > > > On Mon, Mar 19, 2018 at 2:23 PM, Christopher Weller < > christopher.wel...@gm.com> wrote: > >> Hi, >> >> >>

Re: [EXTERNAL] Yarn : Limiting users to list only his applications

2018-03-19 Thread Benoy Antony
Thanks Christopher. On Mon, Mar 19, 2018 at 2:23 PM, Christopher Weller < christopher.wel...@gm.com> wrote: > Hi, > > > >I am not aware of any way to restrict a user from listing the running > applications. > > > > Regards, > > > *Christopher * > > > > *From:* Benoy Antony [mailto:bant...@gm

RE: [EXTERNAL] Yarn : Limiting users to list only his applications

2018-03-19 Thread Christopher Weller
Hi, I am not aware of any way to restrict a user from listing the running applications. Regards, Christopher From: Benoy Antony [mailto:bant...@gmail.com] Sent: Monday, March 19, 2018 5:19 PM To: user Subject: [EXTERNAL] Yarn : Limiting users to list only his applications Hi , As far as

Re: map task attempt failed after 300

2018-03-15 Thread Naresh Dulam
these settings helped me resolving problem -Dmapreduce.task.timeout=360 -Dmapreduce.jobtracker.expire.trackers.interval=360 On Wed, Jan 17, 2018 at 8:18 PM, Naresh Dulam wrote: > I have sqoop job which is pulling data from huge database table by > filtering data. > I can't create a te

Re: Getting yarn node labels statistics

2018-03-01 Thread Sunil G
Hi Soheil There is one REST endpoint named "/label-mappings". This will provide nodelabel -> nodes mapping. http://rm-http-address:port/ws/v1/cluster/label-mappings Thanks Sunil On Thu, Mar 1, 2018 at 8:33 PM Soheil Pourbafrani wrote: > Is there any command or URL (like JMX) to get nodemanage
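A small client for the endpoint mentioned above might look like the sketch below. The path /ws/v1/cluster/label-mappings is taken from the message; the default port 8088 (the usual ResourceManager web port), the explicit Accept header, and the function names are assumptions, and the shape of the returned JSON is not shown in the thread, so the code returns it unparsed beyond json.load.

```python
import json
from urllib.request import Request, urlopen


def label_mappings_url(rm_http_address, port=8088):
    """Build the URL of the ResourceManager label-mappings REST endpoint."""
    return f"http://{rm_http_address}:{port}/ws/v1/cluster/label-mappings"


def fetch_label_mappings(rm_http_address, port=8088, timeout=10):
    """Fetch the node-label -> nodes mapping from a running ResourceManager."""
    request = Request(
        label_mappings_url(rm_http_address, port),
        headers={"Accept": "application/json"},  # ask for JSON explicitly
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


print(label_mappings_url("rm-http-address"))
```

fetch_label_mappings needs a reachable ResourceManager, so only the URL builder is exercised here.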

Re: Git tag policy

2018-02-26 Thread Lars Francke
Hi Chris, thanks! The tags are fine. It'd just be nice to have them consistent. All releases under rel/ would be great. And do you happen to know what DSBCR means? Cheers, Lars On Thu, Feb 22, 2018 at 9:16 PM, Chris Douglas wrote: > On T

Re: Hadoop Problem: Setup a Hadoop Multinode Cluster (2 Nodes)

2018-02-23 Thread Arpit Agarwal
Look for errors in your DataNode log file. It’s in $HADOOP_HOME/logs by default. On Feb 23, 2018, at 12:55 AM, Butler, RD, Mnr <17647...@sun.ac.za> <17647...@sun.ac.za> wrote: To whom it may concern I have two computers, the one I work on (CENTOS installed) and a second computer (also CENT

Re: Git tag policy

2018-02-22 Thread Chris Douglas
On Tue, Feb 20, 2018 at 3:09 AM, Lars Francke wrote: > Is this intentional or just oversight/inconsistencies? The release candidate (RC) tags are created during votes. They can probably be cleaned up after the release is published. At a glance, rel/ looks correct. The hash should match the RC ta

Re: Kerberos impersonation question

2018-02-08 Thread Bear Giles
Oops - that solution was actually specific to a different part of our code and when I changed the latter it broke. Updating my earlier message for future Google searches. We're always calling UserGroupInformation.createProxy() when we're using user impersonation. If we do that then the three argum

Re: HDFS replication factor

2018-02-02 Thread रविशंकर नायर
This is solved in Hadoop 3. So stay tuned Best, On Feb 2, 2018 6:26 AM, "李立伟" wrote: > Hi: > It's my understanding that HDFS write operation is not considered > completed until all of the replicas have been successfully written. If so, > does the replication factor affect the write latency

Re: Regarding containers not launching

2018-02-02 Thread Eric Payne
Nishchay Malhotra, what scheduler are you using? Also, what are the settings for each queue? From: Billy Watson To: nishchay malhotra Cc: "common-u...@hadoop.apache.org" Sent: Tuesday, January 30, 2018 9:47 AM Subject: Re: Regarding containers not launching Is your j

Re: Re: performance about writing data to HDFS

2018-02-01 Thread 徐传印
IOException -Original Message- From: "Miklos Szegedi" Sent: 2018-01-30 01:50:23 (Tuesday) To: "徐传印" Cc: Hdfs-dev , "Hadoop Common" , "common-u...@hadoop.apache.org" Subject: Re: performance about writing data to HDFS Hello, Here is an example. You can set an initial lo

Re: Kerberos impersonation question

2018-01-31 Thread Bear Giles
I figured it out. Of course it's obvious in retrospect. The tests passed after I added a call to user.setAuthMethod(KERBEROS) after createProxy(). I didn't need to do that with SIMPLE auth so I assumed the same would be true with Kerberos auth. The UGI's authentication method was set to PROXY but

Re: Regarding containers not launching

2018-01-31 Thread Billy Watson
Also, is there anything interesting in the yarn scheduler logs? Something about scheduling being skipped? On Wed, Jan 31, 2018 at 05:16 Billy Watson wrote: > Ok, and your container settings? > > On Wed, Jan 31, 2018 at 02:38 nishchay malhotra < > nishchay.malht...@gmail.com> wrote: > >> yes m

Re: Regarding containers not launching

2018-01-31 Thread Billy Watson
Ok, and your container settings? On Wed, Jan 31, 2018 at 02:38 nishchay malhotra wrote: > yes my job has about 160,000 maps and my cluster not getting fully > utilized around 6000 maps ran for 2 hrs and then I killed the job. At any > point of time only 40 containers are running thats just 11% o

Re: Regarding containers not launching

2018-01-30 Thread nishchay malhotra
yes my job has about 160,000 maps and my cluster not getting fully utilized around 6000 maps ran for 2 hrs and then I killed the job. At any point of time only 40 containers are running thats just 11% of my cluster capacity. { "classification": "mapred-site", "properties": { "mapredu

Re: Can't launch Flink on Hadoop cluster: Call From localhost.localdomain/127.0.0.1 to localhost:36063 failed

2018-01-30 Thread Julio Biason
Oh, forgot to mention: The application is loaded successfully: 2018-01-30 17:28:38,157 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1517332236216_0002 2018-01-30 17:28:38,582 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImp

Re: Regarding containers not launching

2018-01-30 Thread Billy Watson
Is your job able to use more containers, I.e. does your job have tasks waiting or are all tasks in progress? William Watson On Tue, Jan 30, 2018 at 1:56 AM, nishchay malhotra < nishchay.malht...@gmail.com> wrote: > What should I be looking for if my 24-node cluster in not launching enough > con

RE: HDFS latency and bandwidth/speed

2018-01-29 Thread Manuel Sopena Ballesteros
Thank you very much Anu, This is very useful Manuel From: Anu Engineer [mailto:aengin...@hortonworks.com] Sent: Tuesday, January 30, 2018 12:25 PM To: Manuel Sopena Ballesteros; user@hadoop.apache.org Subject: Re: HDFS latency and bandwidth/speed Hi Manuel, Depending on your use case: There

Re: HDFS latency and bandwidth/speed

2018-01-29 Thread Anu Engineer
Hi Manuel, Depending on your use case: There are several tools. Unfortunately, most of them need some familiarity with HDFS. Here is a quick set of links that google returns. https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/Benchmarking.html An old blog, but most of th

Re: performance about writing data to HDFS

2018-01-29 Thread Miklos Szegedi
Hello, Here is an example. You can set an initial low replication like this code does: https://github.com/apache/hadoop/blob/56feaa40bb94fcaa96ae668eebfabec4611928c0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-uploader/src/main/java/org/apache/hadoop/mapred/uploader/F

Re: Anyone has tried accessing TDE using HDFS Java APIs

2018-01-29 Thread praveenesh kumar
Hi Ajay Did you get any chance to look into this. Thanks Regards Prav On Fri, Jan 26, 2018 at 8:48 AM, praveenesh kumar wrote: > Hi Ajay > > We are using HDP 2.5.5 with HDFS 2.7.1.2.5 > > Thanks > Prav > > On Thu, Jan 25, 2018 at 5:47 PM, Ajay Kumar > wrote: > >> Hi Praveenesh, >> >> >> >> Wh

Re: Strange log on yarn commands

2018-01-27 Thread Soheil Pourbafrani
I use hadoop 2.7.5 On Sat, Jan 27, 2018 at 12:04 PM, Soheil Pourbafrani wrote: > Hi, I've set up YARN HA cluster with id rm1 and rm2, rm2 is active > resourcemanager > > when I run yarn commands on terminal like: > yarn node -list > or > yarn rmadmin -replaceLabelsOnNode "datanode1=online" > > i

Re: Kerberos auth + user impersonation

2018-01-26 Thread Bear Giles
The supergroup is 'supergroup'. The user 'snapuser' is in that group. I've added hadoop.proxyuser.snapuser.hosts, .groups, and .users to the conf file. (Via advanced options safety valve for core-site.xml in CDH manager.) I verified the change is in the deployed configuration. It works for SIMPL

Re: Kerberos auth + user impersonation

2018-01-26 Thread Jorge Machado
Have you added the proxy.***.hosts to hadoop config ? Check this: https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/Superusers.html Jorge Machado www.jmachado.me > On 26 Jan 201

Re: Kerberos auth + user impersonation

2018-01-26 Thread Bear Giles
Thanks all. I've made the changes but am still getting an error. Notably it's not a "user X cannot impersonate Y" error. exc: Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS] exc: at org.apache.hadoop.security.SaslRpcClient.select

Re: Anyone has tried accessing TDE using HDFS Java APIs

2018-01-26 Thread praveenesh kumar
Hi Ajay We are using HDP 2.5.5 with HDFS 2.7.1.2.5 Thanks Prav On Thu, Jan 25, 2018 at 5:47 PM, Ajay Kumar wrote: > Hi Praveenesh, > > > > What version of Hadoop you are using? > > > > Thanks, > > Ajay > > > > *From: *praveenesh kumar > *Date: *Thursday, January 25, 2018 at 8:22 AM > *To: *"u

Re: Anyone has tried accessing TDE using HDFS Java APIs

2018-01-25 Thread Ajay Kumar
Hi Praveenesh, What version of Hadoop you are using? Thanks, Ajay From: praveenesh kumar Date: Thursday, January 25, 2018 at 8:22 AM To: "user@hadoop.apache.org" Subject: Anyone has tried accessing TDE using HDFS Java APIs Hi We are trying to access TDE files using HDFS JAVA API. The user wh

Re: Kerberos auth + user impersonation

2018-01-25 Thread Wei-Chiu Chuang
Hi Near, Try setting up the proxy user by following this doc: https://www.cloudera.com/documentation/enterprise/latest/topics/admin_hdfs_proxy_users.html A while ago I helped one of our customers configure a proxy user. If you have at-rest encryption in the cluster, you'd also need to configure KMS proxy

Re: Kerberos auth + user impersonation

2018-01-25 Thread Jorge Machado
Hi Bear, I have spent quite some time on this topic. Actually, if you just set HADOOP_PROXY_USER and then use loginUserFromKeytab or loginfromSubject, it will create a proxy for you. Have you set the hadoop.proxyuse..hosts ? That is important and could be your error too. Jorge Machado ww

Re: Yarn StandBy Resourcemanager WebUi

2018-01-24 Thread Sunil G
Hi Soheil YARN web endpoint is designed in such a way that even if you land on standby RM's web endpoint, it will be redirected to active RM. - Sunil On Sun, Jan 21, 2018 at 3:03 AM Soheil Pourbafrani wrote: > I've configured Yarn cluster in high availability mode with standby > resource manage

Re: performance issue when using "hdfs setfacl -R"

2018-01-17 Thread Rushabh Shah
Try increasing the heap size of the client via HADOOP_CLIENT_OPTS. The default is 128M IIRC. This might improve the performance. You can bump it up to 1G. On Tue, Jan 16, 2018 at 10:03 PM, ping wang wrote: > Hi advisers, > We use "hdfs setfacl -R" for file ACL control. As the data directory is > big

Re: Using Kerberized Hadoop with two AD domains and ShuffleError

2018-01-11 Thread Michal Klempa
Hi, we were able to upgrade our auth_to_local from RULE:[1:$1@$0](.*@BRATISLAVA.TRIVIADATA.COM)s/@.*// to RULE:[1:$1@$0](.*@BRATISLAVA.TRIVIADATA.COM)s/(.*)/\1/L the former one is checking for a correct domain, but basically the substitution works same as DEFAULT - taking only the first componen
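To see what the two rules above do to a principal, here is a rough simulation in Python. It is not Hadoop's implementation: it handles only single-component principals (the [1:$1@$0] case), assumes the filter in parentheses must match the whole formatted string, assumes the sed part replaces the first match only, and treats the trailing /L as lowercasing the result. The principal Alice@BRATISLAVA.TRIVIADATA.COM and the function name apply_rule are made up for the example.

```python
import re


def apply_rule(principal, match_regex, sed_pattern, sed_replacement, lowercase=False):
    """Mimic RULE:[1:$1@$0](match)s/pattern/replacement/ for a one-component principal.

    Returns None when the principal does not pass the rule's filter.
    """
    user, realm = principal.split("@", 1)
    formatted = f"{user}@{realm}"  # the [1:$1@$0] part
    if not re.fullmatch(match_regex, formatted):
        return None
    result = re.sub(sed_pattern, sed_replacement, formatted, count=1)
    return result.lower() if lowercase else result


FILTER = r".*@BRATISLAVA.TRIVIADATA.COM"  # copied from the rules in the message
principal = "Alice@BRATISLAVA.TRIVIADATA.COM"

old_name = apply_rule(principal, FILTER, r"@.*", "")                     # s/@.*//
new_name = apply_rule(principal, FILTER, r"(.*)", r"\1", lowercase=True)  # s/(.*)/\1/L

print(old_name)  # Alice
print(new_name)  # alice@bratislava.triviadata.com
```

Under these assumptions the old rule strips the realm and the new rule keeps the full principal, lowercased, so users with the same short name in two AD domains no longer collide.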

Re: Aws EMR Hadoop Web Access

2018-01-09 Thread Wei-Chiu Chuang
There's a project called Apache Knox that seems to offer what you need. https://hortonworks.com/apache/knox-gateway/ On Tue, Jan 9, 2018 at 2:20 PM, Jhon Anderson Cardenas Diaz < jhonderson2...@gmail.com> wrote: > According to aws documentation for EMR web access: > > > > *Setup Web Connection

Re: Hadoop 3.0 doesn't detect the correct conf folder

2018-01-05 Thread Allen Wittenauer
On 2017-12-21 00:25, Jeff Zhang wrote: > I tried the hadoop 3.0, and can start dfs properly, but when I start yarn, > it fails with the following error > ERROR: Cannot find configuration directory > "/Users/jzhang/Java/lib/hadoop-3.0.0/conf" > > Actually, this is not the correct conf folder. It

Re: Questions About Event Listener For Yarn RM

2018-01-05 Thread Herier chen
I see. Thanks for your explanation. On Fri, Jan 5, 2018 at 10:02 AM, Miklos Szegedi wrote: > Jessica, > > Thank you for raising this. Currently there is no event listener, you can > only poll the REST API. > Please feel free to open a jira for this. > > Best Regards, > Miklos > > > On Thu, Jan 4

Re: Questions About Event Listener For Yarn RM

2018-01-05 Thread Miklos Szegedi
Jessica, Thank you for raising this. Currently there is no event listener, you can only poll the REST API. Please feel free to open a jira for this. Best Regards, Miklos On Thu, Jan 4, 2018 at 3:41 PM, Herier chen wrote: > Hi Hadoop Experts, > > Happy New Years!! > > I know we have yarn RM re

Re: queries regarding hadoop DFS

2018-01-03 Thread Philippe Kernévez
Hi Sachin, On Mon, Dec 18, 2017 at 9:09 AM, Sachin Tiwari wrote: > Hi > > I am trying to use hadoop as distributed file storage system. > > I did a POC with a small cluster with 1 name-node and 4 datanodes and I > was able to get/put files using hdfs client and monitor the datanodes > status on:

Re: UserGroupInformation and Kerberos

2018-01-02 Thread Wei-Chiu Chuang
Hi Jorge, If you use Hadoop library as a client, and your first login using key is via UserGroupInformation#loginUserFromKeytab(), the client automatically relogins again using keytab when it gets an exception (see o.a.h.ipc.Client#handleSaslConnectionFailure). Note: using UserGroupInformation.lo

Re: Creating mapred-site.xml file

2017-12-28 Thread Miklos Szegedi
Gary, It is there in Hadoop 3.0 as well. It is just named simply as mapred-site.xml. # find . -name mapred-site* ./etc/hadoop/mapred-site.xml # find . -name mapred-site* | xargs cat Thank you, Miklos On Mon, Dec 25, 2017 at 2:04 PM, Gary Beckler wrote: > I am trying to create the map

RE: Help me understand hadoop caching behavior

2017-12-27 Thread Frank Luo
files (7 files in parallel) I get read speeds in excess of 75GB/s. Obviously this is DRAM speed, here’s the problem…each of the 4 nodes only has 32GB of RAM, and I’m asking Hadoop to re-read over 400GB of data. I am using the read back data, so it isn’t the compiler optimizing something out, because

Re: Help me understand hadoop caching behavior

2017-12-27 Thread Avery, John
n parallel) I get read speeds in excess of 75GB/s. Obviously this is DRAM speed, here’s the problem…each of the 4 nodes only has 32GB of RAM, and I’m asking Hadoop to re-read over 400GB of data. I am using the read back data, so it isn’t the compiler optimizing something out, because when I turn off

Re: getting pid for yarn container

2017-12-13 Thread Sebastian Nagel
Hi, I don't know whether there is a more efficient way, but this works: 1. get the host name or IP of the node the container is running 2. login to this node 3. grep the process table for the container ID (it's passed as argument to Java) $ ps aux | grep container_... yarn 16076 .../
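Step 3 above (grep the process table for the container ID) can be scripted once you are on the node. The sketch below parses text in the usual "ps aux" column layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); the function name pids_for_container and the sample process lines, including the container ID, are hypothetical.

```python
def pids_for_container(ps_output, container_id):
    """Return the PIDs of processes whose command line mentions container_id."""
    pids = []
    for line in ps_output.splitlines():
        # Split into the 10 fixed columns plus the full command line.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or malformed row
        if container_id in fields[10]:
            pids.append(int(fields[1]))
    return pids


# Hypothetical sample in "ps aux" layout, for illustration only.
SAMPLE_PS = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
yarn     16076  2.0  1.5 204800 51200 ?        Sl   10:00   0:12 /usr/bin/java -Xmx512m org.example.Task container_0001_01_000002
yarn     16100  1.0  0.9 102400 30000 ?        Sl   10:01   0:03 /usr/bin/java -Xmx512m org.example.Task container_0001_01_000003
root       812  0.0  0.1  10000  2000 ?        Ss   09:00   0:00 /usr/sbin/sshd
"""

print(pids_for_container(SAMPLE_PS, "container_0001_01_000002"))
```

On a real node the input could come from subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout. A container may show more than one matching process (for example a wrapper shell plus the JVM), which is why the function returns a list.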

Re: Hive - Json Serde - ORC

2017-12-06 Thread Wei-Chiu Chuang
Hi I think you are better off asking this question at the hive mailing list. Best On Wed, Dec 6, 2017 at 6:43 AM, kaducangica . wrote: > Hi all, > > i have a very complex json that i need to insert in a hive table. A json > example follows attached. > > First of all i read a json file with Spark

Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Susheel Kumar Gadalay
Check properties yarn.nodemanager.hostname, yarn.resourcemanager.hostname under yarn-site.xml. On 12/5/17, Alvaro Brandon wrote: > Thanks for your answer Vinay: > > The thing is that I'm using Marathon and not the Docker engine per se. I > don't want to set a -h parameter to each instance that i

Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Alvaro Brandon
Thanks for your answer Vinay: The thing is that I'm using Marathon and not the Docker engine per se. I don't want to set a -h parameter to each instance that is launched, since this is the responsibility of the container orchestrator platform. That's why I need an option like the HDFS one. Alvaro

Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Vinayakumar B
Hi Alvaro, I think you can configure a custom hostname for docker containers as well. The hostname should be provided during launch of containers using the -h parameter. And with a user-created docker network, DNS resolution of these hostnames among the containers is possible. provide --network-alias pa

Re: YARN Auth issue

2017-12-04 Thread Ravi Prakash
Hi Yaniv! The project looks interesting. Good luck! Sorry about the late reply. "HADOOP_HOME or hadoop.home.dir are not set." indicates that the environment variable has either not been set or the directory doesn't have the requisite Hadoop binaries. You can connect a debugger to the NodeMana

Re: Local read-only users in ambari

2017-12-01 Thread Grant Overby
Sentry can provide restriction on Hive: https://cwiki.apache.org/confluence/display/SENTRY/Sentry+Tutorial HDFS follows the POSIX model for permissions. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html On Fri, Dec 1, 2017 at 12:38 PM, sidharth kumar

Re: Parameter repeated twice in hdfs-site.xml

2017-11-30 Thread Arpit Agarwal
That looks confusing, usability-wise. * A related question is how can I see the parameters with which a datanode was launched in order to check these values You can navigate to the conf servlet of the DataNode web UI e.g. http://w.x.y.z:50075/conf From: Alvaro Brandon Date: Thursday, No

Re: Hadoop DR setup for HDP 2.6

2017-11-16 Thread Susheel Kumar Gadalay
Thanks Sandeep for the info. Is it bundled with HDP or separate add on? Also is it open source or priced? Thanks SKG On 11/16/17, Sandeep Nemuri wrote: > You may want to check DLM which does DR ( > https://docs.hortonworks.com/HDPDocuments/DLM1/DLM-1.0.0/bk_dlm-administration/content/dlm_termin

Re: Hadoop DR setup for HDP 2.6

2017-11-16 Thread Sandeep Nemuri
You may want to check DLM which does DR ( https://docs.hortonworks.com/HDPDocuments/DLM1/DLM-1.0.0/bk_dlm-administration/content/dlm_terminology.html ) On Wed, Nov 15, 2017 at 10:43 PM, Susheel Kumar Gadalay wrote: > Hi, > > We have to setup DR for production Hadoop environment based on HDP 2.6.

Re: What does JobPriority mean?

2017-11-13 Thread Benson Qiu
Thanks, Sunil! I found your JIRA (YARN-1963 ) that has a really great design doc. On Mon, Nov 13, 2017 at 6:04 PM, Sunil G wrote: > Hi Benson, > > Prior to 2.8 releases, YARN did not support priorities for its > applications. Currently user can sp

Re: What does JobPriority mean?

2017-11-13 Thread Sunil G
Hi Benson, Prior to 2.8 releases, YARN did not support priorities for its applications. Currently user can specify priority (higher integer value means higher priority) to its applications so that high priority apps could get resources faster from scheduler (priority is applicable within a leaf qu

Re: PendingDeletionBlocks immediately after Namenode failover

2017-11-13 Thread Ravi Prakash
Hi Michael! Thank you for the report. I'm sorry I don't have advice other than the generic advice, like please try a newer version of Hadoop (say Hadoop-2.8.2) . You seem to already know that the BlockManager is the place to look. If you found it to be a legitimate issue which could affect Apache

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-13 Thread Bible, Landy
, Landy Cc: Alfonso Elizalde; Clay McDonald; Dr. Tibor Kurina; user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 Hi Landy, Do you remember the hadoop distribution version you used to install on windows? Best wishes, Pavel On 12 November 2017 at 04:45, Bi

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-12 Thread Pavel Drankov
- > *From:* Alfonso Elizalde > *Sent:* Nov 11, 2017 14:16 > *To:* Clay McDonald > *Cc:* Dr. Tibor Kurina; Pavel Drankov; user@hadoop.apache.org > > *Subject:* Re: Problems installing Hadoop on Windows Server 2012 R2 > > What is windows? :) > > > > On Nov 11, 2017, at

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Bible, Landy
. -Landy Sent from Nine<http://www.9folders.com/> From: Alfonso Elizalde Sent: Nov 11, 2017 14:16 To: Clay McDonald Cc: Dr. Tibor Kurina; Pavel Drankov; user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 What is windows?

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Alfonso Elizalde
ou trying to install Hadoop on the windows...? > > Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10 > > From: Pavel Drankov<mailto:titant...@gmail.com> > Sent: Saturday, November 11, 2017 18:06 > Cc: user@hadoop.apache.org<mailto:user@

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Clay McDonald
Drankov<mailto:titant...@gmail.com> Sent: Saturday, November 11, 2017 18:06 Cc: user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 Hi, Why are you trying to run it on Windows? It is not recommended. Best wishes, Pavel

RE: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Dr. Tibor Kurina
Exactly… 😊 Why the HELL are you trying to install Hadoop on Windows…? Sent from Mail for Windows 10 From: Pavel Drankov Sent: Saturday, November 11, 2017 18:06 Cc: user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 Hi, Why are you trying to run it

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Pavel Drankov
Hi, Why are you trying to run it on Windows? It is not recommended. Best wishes, Pavel On 10 November 2017 at 04:44, Iván Galaviz wrote: > Hi, > > I'm having a lot of problems installing Hadoop on Windows Server 2012 R2, > > I'm currently trying to install it with these programs: > JDK 8u151 >

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-31 Thread Ravi Prakash
Hi Blaze! Thanks for the link, although it did not have anything I didn't already know. I'm afraid I don't quite follow what your concern is here. The files are protected using UNIX permissions on the worker nodes. Is that not what you are seeing? Are you using the LinuxContainerExecutor? Are the

Re: Unable to append to a file in HDFS

2017-10-31 Thread Ravi Prakash
Hi Tarik! I'm glad you were able to diagnose your issue. Thanks for sharing with the user list. I suspect your writer may have set minimum replication to 3, and since you have only 2 datanodes, the Namenode will not allow you to successfully close the file. You could add another node or reduce the

RE: How to add new journal nodes without service downtime?

2017-10-31 Thread Fu, Yong
According to Cloudera’s guide, moving JournalNodes requires downtime: https://www.cloudera.com/documentation/enterprise/5-7-x/topics/admin_nn_migrate_roles.html#concept_w3h_m2l_2r There is also a community ticket about this problem that is still unresolved: https://issues.apache.org/jira/brows

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-30 Thread Blaze Spinnaker
Ravi, The code and architecture are based on the Hadoop source code submitted through the YARN client. This is an issue for MapReduce as well, e.g.: https://pravinchavan.wordpress.com/2013/04/25/223/ On Mon, Oct 30, 2017 at 1:15 PM, Ravi Prakash wrote: > Hi Blaze! > > Thanks for digging into

Re: Unable to append to a file in HDFS

2017-10-30 Thread Tarik Courdy
Hello Ravi - I have pinpointed my issue a little more. When I create a file with a dfs.replication factor of 3, I can never append. However, if I create a file with a dfs.replication factor of 1, then I can append to the file all day long. Thanks again for your help regarding this. -Tarik On M

Re: Unable to append to a file in HDFS

2017-10-30 Thread Tarik Courdy
Hello Ravi - I grepped the directory that has my logs and couldn't find any instance of "NameNode.complete". I just created a new file in hdfs using hdfs -touchz and it is allowing me to append to it with no problem. Not sure who is holding the eternal lease on my first file. Thanks again for yo

Re: Unable to append to a file in HDFS

2017-10-30 Thread Ravi Prakash
Hi Tarik! You're welcome! If you look at the namenode logs, do you see a "DIR* NameNode.complete: " message? It should have been written when the first client called close(). Cheers Ravi On Mon, Oct 30, 2017 at 1:13 PM, Tarik Courdy wrote: > Hello Ravi - > > Thank you for your response. I h

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-30 Thread Ravi Prakash
Hi Blaze! Thanks for digging into this. I'm sure security related features could use more attention. Tokens for one user should be isolated from other users. I'm sorry I don't know how spark uses them. Would this question be more appropriate on the spark mailing list? https://spark.apache.org/com
