Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Susheel Kumar Gadalay
Check properties yarn.nodemanager.hostname, yarn.resourcemanager.hostname under yarn-site.xml. On 12/5/17, Alvaro Brandon wrote: > Thanks for your answer Vinay: > > The thing is that I'm using Marathon and not the Docker engine per se. I > don't want to set a -h
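
For reference, a minimal yarn-site.xml sketch of the two properties mentioned above (the values are illustrative assumptions, not taken from the thread; 0.0.0.0 is the bind-all default for the NodeManager):

    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>rm-host.example.com</value>
    </property>
    <property>
      <name>yarn.nodemanager.hostname</name>
      <value>0.0.0.0</value> <!-- bind-all; avoids baking a hostname into each container -->
    </property>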

Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Alvaro Brandon
Thanks for your answer Vinay: The thing is that I'm using Marathon and not the Docker engine per se. I don't want to set a -h parameter to each instance that is launched, since this is the responsibility of the container orchestrator platform. That's why I need an option like the HDFS one.

Re: Avoiding using hostname for YARN nodemanagers

2017-12-05 Thread Vinayakumar B
Hi Alvaro, I think you can configure a custom hostname for docker containers as well. The hostname should be provided during launch of the containers using the -h parameter. And with a user-created docker network, DNS resolution of these hostnames among the containers is possible. Provide --network-alias
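
A sketch of the approach Vinay describes, assuming a hypothetical user-defined network, alias, and image name (none of these are from the thread):

    # a user-defined network gives containers DNS resolution among themselves
    docker network create hadoop-net
    # launch with an explicit hostname plus a DNS alias on that network
    docker run -d --network hadoop-net --network-alias nm1 -h nm1 myorg/nodemanager-image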

Re: Local read-only users in ambari

2017-12-01 Thread Grant Overby
Sentry can provide restriction on Hive: https://cwiki.apache.org/confluence/display/SENTRY/Sentry+Tutorial HDFS follows the POSIX model for permissions. https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html On Fri, Dec 1, 2017 at 12:38 PM, sidharth

Re: Parameter repeated twice in hdfs-site.xml

2017-11-30 Thread Arpit Agarwal
That looks confusing, usability-wise. * A related question is how can I see the parameters with which a datanode was launched in order to check these values You can navigate to the conf servlet of the DataNode web UI e.g. http://w.x.y.z:50075/conf From: Alvaro Brandon
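
The conf servlet can be fetched directly; a sketch using the address from Arpit's example (50070 for the NameNode is the Hadoop 2.x default, assumed here):

    curl http://w.x.y.z:50075/conf          # the DataNode's live configuration
    curl http://namenode-host:50070/conf    # same servlet on the NameNode web UI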

Re: Hadoop DR setup for HDP 2.6

2017-11-16 Thread Susheel Kumar Gadalay
Thanks Sandeep for the info. Is it bundled with HDP or a separate add-on? Also, is it open source or priced? Thanks SKG On 11/16/17, Sandeep Nemuri wrote: > You may want to check DLM which does DR ( >

Re: Hadoop DR setup for HDP 2.6

2017-11-16 Thread Sandeep Nemuri
You may want to check DLM which does DR ( https://docs.hortonworks.com/HDPDocuments/DLM1/DLM-1.0.0/bk_dlm-administration/content/dlm_terminology.html ) On Wed, Nov 15, 2017 at 10:43 PM, Susheel Kumar Gadalay wrote: > Hi, > > We have to setup DR for production Hadoop

Re: What does JobPriority mean?

2017-11-13 Thread Benson Qiu
Thanks, Sunil! I found your JIRA (YARN-1963) that has a really great design doc. On Mon, Nov 13, 2017 at 6:04 PM, Sunil G wrote: > Hi Benson, > > Prior to 2.8 releases, YARN did not support priorities for its > applications.

Re: What does JobPriority mean?

2017-11-13 Thread Sunil G
Hi Benson, Prior to the 2.8 releases, YARN did not support priorities for its applications. Currently a user can specify a priority (a higher integer value means higher priority) for their applications so that high-priority apps can get resources faster from the scheduler (priority is applicable within a leaf
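
For context, a sketch of setting the priority (the application id is hypothetical; yarn application -updatePriority exists from Hadoop 2.8 onward, and the -D form assumes the job uses ToolRunner):

    # at submission time, for an MR job
    hadoop jar my-job.jar MyJob -Dmapreduce.job.priority=HIGH input output
    # or on a running application
    yarn application -updatePriority 8 -appId application_1510000000000_0001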

Re: PendingDeletionBlocks immediately after Namenode failover

2017-11-13 Thread Ravi Prakash
Hi Michael! Thank you for the report. I'm sorry I don't have advice other than the generic advice, like please try a newer version of Hadoop (say Hadoop-2.8.2). You seem to already know that the BlockManager is the place to look. If you found it to be a legitimate issue which could affect

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-13 Thread Bible, Landy
:50 AM To: Bible, Landy Cc: Alfonso Elizalde; Clay McDonald; Dr. Tibor Kurina; user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 Hi Landy, Do you remember the hadoop distribution version you used to install on Windows? Best wishes, Pavel On 12 November 2017

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-12 Thread Pavel Drankov
> *From:* Alfonso Elizalde <elizalde.alfo...@gmail.com> > *Sent:* Nov 11, 2017 14:16 > *To:* Clay McDonald > *Cc:* Dr. Tibor Kurina; Pavel Drankov; user@hadoop.apache.org > > *Subject:* Re: Problems installing Hadoop on Windows Server 2012 R2 > >

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Bible, Landy
Sent from Nine<http://www.9folders.com/> From: Alfonso Elizalde <elizalde.alfo...@gmail.com> Sent: Nov 11, 2017 14:16 To: Clay McDonald Cc: Dr. Tibor Kurina; Pavel Drankov; user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Alfonso Elizalde
> wrote: > > Exactly... ? > Why, for the HELL, You trying to install Hadoop on the windows...? > > Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10 > > From: Pavel Drankov<mailto:titant...@gmail.com> > Sent: Saturday, Novembe

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Clay McDonald
ws 10 From: Pavel Drankov<mailto:titant...@gmail.com> Sent: Saturday, November 11, 2017 18:06 Cc: user@hadoop.apache.org<mailto:user@hadoop.apache.org> Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 Hi, Why are you trying to run it on Windows? It is not recommend

RE: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Dr. Tibor Kurina
Exactly…  Why, for the HELL, You trying to install Hadoop on the windows…? Sent from Mail for Windows 10 From: Pavel Drankov Sent: Saturday, November 11, 2017 18:06 Cc: user@hadoop.apache.org Subject: Re: Problems installing Hadoop on Windows Server 2012 R2 Hi,  Why are you trying to run

Re: Problems installing Hadoop on Windows Server 2012 R2

2017-11-11 Thread Pavel Drankov
Hi, Why are you trying to run it on Windows? It is not recommended. Best wishes, Pavel On 10 November 2017 at 04:44, Iván Galaviz wrote: > Hi, > > I'm having a lot of problems installing Hadoop on Windows Server 2012 R2, > > I'm currently trying to install it with

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-31 Thread Ravi Prakash
Hi Blaze! Thanks for the link, although it did not have anything I didn't already know. I'm afraid I don't quite follow what your concern is here. The files are protected using UNIX permissions on the worker nodes. Is that not what you are seeing? Are you using the LinuxContainerExecutor? Are the

Re: Unable to append to a file in HDFS

2017-10-31 Thread Ravi Prakash
Hi Tarik! I'm glad you were able to diagnose your issue. Thanks for sharing with the user list. I suspect your writer may have set minimum replication to 3, and since you have only 2 datanodes, the Namenode will not allow you to successfully close the file. You could add another node or reduce
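
A sketch of the reduce-replication remedy Ravi suggests (the path is hypothetical; -w waits until the change completes):

    # reduce replication on the existing file so close()/append can succeed with 2 datanodes
    hdfs dfs -setrep -w 2 /path/to/file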

RE: How to add new journal nodes without service downtime?

2017-10-31 Thread Fu, Yong
From Cloudera’s guide, there should be downtime when moving Journal Nodes: https://www.cloudera.com/documentation/enterprise/5-7-x/topics/admin_nn_migrate_roles.html#concept_w3h_m2l_2r And a ticket from the community about this problem which is still unresolved:

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-30 Thread Blaze Spinnaker
Ravi, The code and architecture are based on the Hadoop source code submitted through the Yarn Client. This is an issue for map reduce as well. e.g.: https://pravinchavan.wordpress.com/2013/04/25/223/ On Mon, Oct 30, 2017 at 1:15 PM, Ravi Prakash wrote: > Hi Blaze! > >

Re: Unable to append to a file in HDFS

2017-10-30 Thread Tarik Courdy
Hello Ravi - I have pinpointed my issue a little more. When I create a file with a dfs.replication factor of 3 I can never append. However, if I create a file with a dfs.replication factor of 1 then I can append to the file all day long. Thanks again for your help regarding this. -Tarik On

Re: Unable to append to a file in HDFS

2017-10-30 Thread Tarik Courdy
Hello Ravi - I grepped the directory that has my logs and couldn't find any instance of "NameNode.complete". I just created a new file in hdfs using hdfs -touchz and it is allowing me to append to it with no problem. Not sure who is holding the eternal lease on my first file. Thanks again for

Re: Unable to append to a file in HDFS

2017-10-30 Thread Ravi Prakash
Hi Tarik! You're welcome! If you look at the namenode logs, do you see a "DIR* NameNode.complete: " message? It should have been written when the first client called close(). Cheers Ravi On Mon, Oct 30, 2017 at 1:13 PM, Tarik Courdy wrote: > Hello Ravi - > > Thank

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-30 Thread Ravi Prakash
Hi Blaze! Thanks for digging into this. I'm sure security related features could use more attention. Tokens for one user should be isolated from other users. I'm sorry I don't know how spark uses them. Would this question be more appropriate on the spark mailing list?

Re: Unable to append to a file in HDFS

2017-10-30 Thread Ravi Prakash
Hi Tarik! The lease is owned by a client. If you launch 2 client programs, they will be viewed as separate (even though the user is same). Are you sure you closed the file when you first wrote it? Did the client program which wrote the file, exit cleanly? In any case, after the namenode lease

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

2017-10-30 Thread Blaze Spinnaker
I looked at this a bit more and I see a container_tokens file in the spark directory. Does this contain the credentials that are added by addCredentials? Is this file accessible to the spark executors? It looks like just a clear-text protobuf file.

Re:

2017-10-30 Thread Ravi Prakash
And one of the good things about open-source projects like Hadoop, you can read all about why :-) : https://issues.apache.org/jira/browse/HADOOP-4952 Enjoy! Ravi On Mon, Oct 30, 2017 at 11:54 AM, Ravi Prakash wrote: > Hi Doris! > > FileContext was created to overcome some

Re:

2017-10-30 Thread Ravi Prakash
Hi Doris! FileContext was created to overcome some of the limitations that we learned FileSystem had after a lot of experience. Unfortunately, a lot of code (I'm guessing maybe even the majority) still uses FileSystem. I suspect FileContext is probably the interface you want to use. HTH, Ravi

Re. ERROR: oz is not COMMAND nor fully qualified CLASSNAME

2017-10-17 Thread Nandakumar Vadivelu
Hi Margus, The commit (code version) you are using for building ozone is very old (Tue Nov 22 17:41:13 2016); can you do a “git pull” on the HDFS-7240 branch and take a new build? The documentation you are referring to is also a very old one; currently the documentation work is happening as part of

Re: How to print values in console while running MapReduce application

2017-10-08 Thread Naganarasimha Garla
If it is for debugging purposes, I would advise trying custom MR counters! Though you will not get them in the console, you can get them from the web UI for a running job too. On Sun, Oct 8, 2017 at 9:24 PM, Harsh J wrote: > Consider running your job in the local mode (set config ' >
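
Counters incremented in the job can also be read back from the command line after (or during) the run; a sketch with a hypothetical job id, group, and counter name:

    mapred job -counter job_1510000000000_0001 MyCounters NUM_BAD_RECORDS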

Re: How to print values in console while running MapReduce application

2017-10-08 Thread Harsh J
Consider running your job in the local mode (set config ' mapreduce.framework.name' to 'local'). Otherwise, rely on the log viewer from the (Job) History Server to check the console prints in each task (under the stdout or stderr sections). On Thu, 5 Oct 2017 at 05:15 Tanvir Rahman
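
A sketch of the local-mode override on the command line, assuming the job uses ToolRunner so the -D options are picked up:

    hadoop jar wordcount.jar WordCount \
      -Dmapreduce.framework.name=local -Dfs.defaultFS=file:/// \
      input/ output/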

Re: How to print values in console while running MapReduce application

2017-10-04 Thread Tanvir Rahman
Hello Demian, Thanks for the answer. 1. I am using Java for writing the MapReduce application. Can you tell me how to do it in Java? 2. In the mapper or reducer function, which command did you use to write the output? Is it going to write it in the log folder? I have multiple nodes and

Re: How to print values in console while running MapReduce application

2017-10-04 Thread Demian Kurejwowski
I did the same tutorial; I think the only way is doing it outside Hadoop, in the command line: cat folder/* | python mapper.py | sort | python reducer.py On Wednesday, October 4, 2017 16:20:31, Tanvir Rahman wrote: Hello, I have a small cluster and I am

Re: How to print values in console while running MapReduce application

2017-10-04 Thread Sultan Alamro
Hi, The easiest way is to open a new window and display the log file as follows: tail -f /path/to/log/file.log Best, Sultan > On Oct 4, 2017, at 5:20 PM, Tanvir Rahman wrote: > > Hello, > I have a small cluster and I am running MapReduce WordCount application in > it.

Re: Will Backup Node download image and edits log from NameNode?

2017-09-30 Thread Gurmukh Singh
Your observation is correct; the backup node will also download. If you look at the journey/evolution of Hadoop, we had primary, backup-only, checkpointing node and then a generic secondary node. The checkpointing node will do the merge of fsimage and edits On 25/9/17 5:57 pm, Chang.Wu wrote: From the

Re: how to set a rpc timeout on yarn application.

2017-09-30 Thread Gurmukh Singh
Hi, Can you explain the job to me a bit? There are a few RPC timeouts, like at the datanode level, mapper timeouts, etc. On 28/9/17 1:47 pm, Demon King wrote: Hi, We have finished a yarn application and deployed it to a hadoop 2.6.0 cluster. But if one machine in the cluster is down, our application will

Re: hadoop questions for a begginer

2017-09-30 Thread Gurmukh Singh
Well, in an actual job the input will be a file. So, instead of: echo "bla ble bli bla" | python mapper.py | sort -k1,1 | python reducer.py you will have: cat file.txt | python mapper.py | sort -k1,1 | python reducer.py The file has to be on HDFS (keeping it simple, it can be other filesystems),
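
Once the pipeline works locally, the same scripts can be submitted to the cluster with Hadoop Streaming; a sketch (the jar path varies by installation, and the HDFS paths are hypothetical):

    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
      -input /user/me/file.txt -output /user/me/out \
      -mapper mapper.py -reducer reducer.py \
      -file mapper.py -file reducer.py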

Re: Inter-cluster Communication

2017-09-27 Thread Madhav A
Rakesh, What sort of communication are you looking for between the clusters? I mean, Is it * At the data node level? * VPC inter-communication between 2 clusters ? * Data replication via custom tools? More details might better help in understanding what you're trying to accomplish. -Madhav

Re: Hadoop "managed" setup basic question (Ambari, CDH?)

2017-09-26 Thread Marton, Elek
If you would like to do it in a more dynamic way you can also use service registries/key-value stores. For example, the configuration could be stored in Consul and the servers (namenode, datanode) could be started with consul-template (https://github.com/hashicorp/consul-template) In case of

Re: Hadoop "managed" setup basic question (Ambari, CDH?)

2017-09-22 Thread Sanel Zukan
Hi, For this amount of nodes, I'd go with automation tools like Ansible[1]/Puppet[2]/Rex[3]. They can install necessary packages, setup /etc/hosts and make per-node settings. Ansibles has a nice playbook (https://github.com/analytically/hadoop-ansible) you can start with and Puppet isn't short

Re: Is Hadoop validating the checksum when reading only a part of a file?

2017-09-20 Thread Ralph Soika
Thanks a lot for your answer. This makes it clear to me now, and I expected that Hadoop works this way. === Ralph On 20.09.2017 07:57, Harsh J wrote: Yes, checksum match is checked for every form of read (unless explicitly disabled). By default, a checksum is generated and stored for every

Re: Is Hadoop validating the checksum when reading only a part of a file?

2017-09-19 Thread Harsh J
Yes, checksum match is checked for every form of read (unless explicitly disabled). By default, a checksum is generated and stored for every 512 bytes of data (io.bytes.per.checksum), so only the relevant parts are checked vs. the whole file when doing a partial read. On Mon, 18 Sep 2017 at 19:23
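
The setting Harsh cites, as an hdfs-site.xml sketch (dfs.bytes-per-checksum is the current property name; io.bytes.per.checksum is the older deprecated one, and 512 is the default):

    <property>
      <name>dfs.bytes-per-checksum</name>
      <value>512</value> <!-- checksum granularity in bytes -->
    </property>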

Re: Hadoop 2.8.0: Job console output suggesting non-existent rmserver 8088:proxy URI

2017-09-13 Thread Ravi Prakash
Hi Kevin! The ApplicationMaster doesn't really need any more configuration I think. Here's something to try out. Launch a very long mapreduce job: # A sleep job with 1 mapper and 1 reducer. (All the mapper and reducer do is sleep for the duration specified in -mt and -rt) yarn jar
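
The command is cut off above; presumably it is the sleep job from the jobclient tests jar, along these lines (the jar path varies by version; -mt/-rt are in milliseconds):

    yarn jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
      sleep -m 1 -r 1 -mt 600000 -rt 600000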

Re: Hadoop 2.8.0: Job console output suggesting non-existent rmserver 8088:proxy URI

2017-09-12 Thread Kevin Buckley
On 9 September 2017 at 05:17, Ravi Prakash wrote: > I'm not sure my reply will be entirely helpful, but here goes. It sheds more light on things than I previously understood, Ravi, so cheers > The ResourceManager either proxies your request to the ApplicationMaster (if >

Re: Question about Resource Manager Rest APIs

2017-09-12 Thread Sunil G
Hi Jason, All data fetched from the ResourceManager, such as the list of apps or reports, is taken at the current time (not cached). Do you expect some other data? - Sunil On Tue, Sep 12, 2017 at 8:09 PM Xu,Jason wrote: > Hi all, > > > > I am trying to get information about the
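
A sketch of the REST calls in question (the host is illustrative, 8088 is the default RM web port, and the application id is hypothetical):

    curl http://rm-host:8088/ws/v1/cluster/apps
    curl http://rm-host:8088/ws/v1/cluster/apps/application_1510000000000_0001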

Re: Apache ambari

2017-09-08 Thread Ravi Prakash
Hi Sidharth! The question seems relevant to the Ambari list : https://ambari.apache.org/mail-lists.html Cheers Ravi On Fri, Sep 8, 2017 at 1:15 AM, sidharth kumar wrote: > Hi, > > Apache ambari is open source. So,can we setup Apache ambari to manage > existing Apache

Re: Hadoop 2.8.0: Job console output suggesting non-existent rmserver 8088:proxy URI

2017-09-08 Thread Ravi Prakash
Hi Kevin! I'm not sure my reply will be entirely helpful, but here goes. The ResourceManager either proxies your request to the ApplicationMaster (if the application is running), or (once the application is finished) serves it itself if the job is in the "cache" (usually the last 1

Re: When is an hdfs-* service restart required?

2017-09-07 Thread Ravi Prakash
Hi Kellen! The first part of the configuration is a good indication of which service you need to restart. Unfortunately the only way to be completely sure is to read the codez. e.g. most hdfs configuration is mapped to variables in DFSConfigKeys $ find . -name *.java | grep -v test | xargs grep

Re: When is an hdfs-* service restart required?

2017-09-07 Thread Mingliang Liu
Restarting datanode(s) only is OK in this case. Thanks, > On Sep 7, 2017, at 10:46 AM, Kellen Arb wrote: > > Hello, > > I have a seemingly simple question, to which I can't find a clear answer. > > Which services/node-types must be restarted for each of the configuration

Re: Sqoop and kerberos ldap hadoop authentication

2017-09-07 Thread Wei-Chiu Chuang
Hi, The message "User xxx not found" feels more like group mapping error. Do you have the relevant logs? Integrating AD with Hadoop can be non-trivial, and Cloudera's general recommendation is to use third party authentication integrator like SSSD or Centrify, instead of using LdapGroupsMapping.

Re: Sqoop and kerberos ldap hadoop authentication

2017-09-07 Thread Rams Venkatesh
Yes it works. However this doesn't work with Microsoft SQL server Sent from my iPhone > On 7 Sep 2017, at 10:09, dna_29a wrote: > > Hi, > I want to run sqoop jobs under kerberos authentication. If I have a ticket > for local Kerberos user (local KDC and user exists as

Re: HDFS: Confused about "immutability" wrt overwrites

2017-09-07 Thread Philippe Kernévez
Hi, Immutability is about rewriting a file (random access). That is massively used by databases, for example. On HDFS you can only append new data to a file. HDFS has permissions like a POSIX file system, so you can remove the 'w' permission on the file if you want to prevent deletion/overwrite. You
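
A sketch of the permission trick Philippe mentions (the path is hypothetical; note an HDFS superuser can still bypass this):

    hadoop fs -chmod 444 /archive/img001.jpg   # drop the 'w' bit: read-only for everyone
    # a subsequent append or overwrite by a non-superuser should now fail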

RE: Is Hadoop basically not suitable for a photo archive?

2017-09-06 Thread Zheng, Kai
he.org Subject: Re: Is Hadoop basically not suitable for a photo archive? I'm late to the party, and this isn't a hadoop solution, but apparently Cassandra is pretty good at this. https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593 On Wed,

Re: Is Hadoop basically not suitable for a photo archive?

2017-09-06 Thread Grant Overby
I'm late to the party, and this isn't a hadoop solution, but apparently Cassandra is pretty good at this. https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593 On Wed, Sep 6, 2017 at 2:48 PM, Ralph Soika wrote: > Hi

Re: Is Hadoop basically not suitable for a photo archive?

2017-09-06 Thread Ralph Soika
Hi I want to thank you all for your answers and your good ideas how to solve the hadoop "small-file-problem". Now I would like to briefly summarize your answers and suggested solutions. First of all I describe once again my general use case: * An external enterprise application needs to

RE: Is Hadoop basically not suitable for a photo archive?

2017-09-06 Thread Muhamad Dimas Adiputro
I think MapR-FS is your solution. From: Anu Engineer [mailto:aengin...@hortonworks.com] Sent: Tuesday, September 05, 2017 10:33 PM To: Hayati Gonultas; Alexey Eremihin; Uwe Geercken Cc: Ralph Soika; user@hadoop.apache.org Subject: Re: Is Hadoop basically not suitable for a photo archive? Please

Re: Is Hadoop basically not suitable for a photo archive?

2017-09-05 Thread Anu Engineer
ken <uwe.geerc...@web.de<mailto:uwe.geerc...@web.de>> Cc: Ralph Soika <ralph.so...@imixs.com<mailto:ralph.so...@imixs.com>>, "user@hadoop.apache.org<mailto:user@hadoop.apache.org>" <user@hadoop.apache.org<mailto:user@hadoop.apache.org>> Subject: Re: Re: Is Had

Re: Re: Is Hadoop basically not suitable for a photo archive?

2017-09-04 Thread daemeon reiydelle
hayati.gonul...@gmail.com] > *Sent:* Tuesday, September 05, 2017 6:06 AM > *To:* Alexey Eremihin <a.eremi...@corp.badoo.com.invalid>; Uwe Geercken < > uwe.geerc...@web.de> > *Cc:* Ralph Soika <ralph.so...@imixs.com>; user@hadoop.apache.org > *Subject:* Re: Re: Is Hadoop basically not

RE: Re: Is Hadoop basically not suitable for a photo archive?

2017-09-04 Thread Zheng, Kai
;; Uwe Geercken <uwe.geerc...@web.de> Cc: Ralph Soika <ralph.so...@imixs.com>; user@hadoop.apache.org Subject: Re: Re: Is Hadoop basically not suitable for a photo archive? I would recommend an object store such as openstack swift as another option. On Mon, Sep 4, 2017 at 1:09 PM Uwe Ge

Re: Re: Is Hadoop basically not suitable for a photo archive?

2017-09-04 Thread Hayati Gonultas
acceptable and and easy way to go for. > > Cheers, > > Uwe > > *Gesendet:* Montag, 04. September 2017 um 21:32 Uhr > *Von:* "Alexey Eremihin" <a.eremi...@corp.badoo.com.INVALID> > *An:* "Ralph Soika" <ralph.so...@imixs.com> > *Cc:* "u

Aw: Re: Is Hadoop basically not suitable for a photo archive?

2017-09-04 Thread Uwe Geercken
alph Soika" <ralph.so...@imixs.com> Cc: "user@hadoop.apache.org" <user@hadoop.apache.org> Betreff: Re: Is Hadoop basically not suitable for a photo archive? Hi Ralph,  In general Hadoop is able to store such data. And even Har archives can be used with conjunction with

Re: Is Hadoop basically not suitable for a photo archive?

2017-09-04 Thread Alexey Eremihin
Hi Ralph, In general Hadoop is able to store such data. And even Har archives can be used in conjunction with WebHDFS (by passing offset and limit attributes). What are your reading requirements? FS metadata is not distributed, and reading the data is limited by the HDFS NameNode server
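
A sketch of the WebHDFS partial read Alexey refers to (host, path, and byte range are illustrative; -L follows the redirect to the serving DataNode):

    curl -L "http://namenode-host:50070/webhdfs/v1/archive/img001.jpg?op=OPEN&offset=1048576&length=65536"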

Re: Region assignment on restart

2017-09-02 Thread Rob Verkuylen
Sorry this was meant for hbase. Copy/paste error. Will post there. On Sat, Sep 2, 2017 at 10:10 AM, Rob Verkuylen wrote: > On CDH5.12 with HBase 1.2, I'm experiencing an issue I thought was long > solved. The regions are all assigned to a single regionserver on a restart >

Re: Mapreduce example from library isuue

2017-09-01 Thread Atul Rajan
Hello Akira, Yes, thanks for the solution. I checked; some classpath entries were missing from the yarn and mapred site files. Adding those resolved the issue and MapReduce ran smoothly. Thanks for the article. Thanks and Regards Atul Rajan -Sent from my iPhone On 02-Sep-2017, at 1:17 AM, Akira

Re: Representing hadoop metrics on ganglia web interface

2017-09-01 Thread Akira Ajisaka
Hi Nishant, Multicast is used to communicate between Ganglia daemons by default and it is banned in AWS EC2. Would you try unicast setting? Regards, Akira On 2017/08/04 12:37, Nishant Verma wrote: Hello We are supposed to collect hadoop metrics and see the cluster health and performance.

Re: Prime cause of NotEnoughReplicasException

2017-09-01 Thread Akira Ajisaka
Hi Nishant, The debug message shows there are not enough racks configured to satisfy the rack awareness. http://hadoop.apache.org/docs/r3.0.0-alpha4/hadoop-project-dist/hadoop-common/RackAwareness.html If you don't need to place replicas in different racks, you can simply ignore the debug
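
For reference, rack awareness is driven by a topology script configured in core-site.xml; a sketch with a hypothetical script path:

    <property>
      <name>net.topology.script.file.name</name>
      <value>/etc/hadoop/conf/topology.sh</value> <!-- maps an IP/host to a /rack-id -->
    </property>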

Re: spark on yarn error -- Please help

2017-09-01 Thread Akira Ajisaka
Hi sidharth, Would you ask Spark related question to the user mailing list of Apache Spark? https://spark.apache.org/community.html Regards, Akira On 2017/08/28 11:49, sidharth kumar wrote: Hi, I have configured apace spark over yarn. I am able to run map reduce job successfully but

Re: Mapreduce example from library isuue

2017-09-01 Thread Akira Ajisaka
Hi Atul, Have you added HADOOP_MAPRED_HOME to yarn.nodemanager.env-whitelist in yarn-site.xml? The document may help: http://hadoop.apache.org/docs/r3.0.0-alpha4/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_a_Single_Node Regards, Akira On 2017/08/29 17:45, Atul Rajan wrote:
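
The setting Akira references, as a yarn-site.xml sketch (the value is the list given in the linked single-node doc; verify it against your version's default):

    <property>
      <name>yarn.nodemanager.env-whitelist</name>
      <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>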

Re: YARN - How is a node for a container determined?

2017-08-29 Thread Grant Overby
Most of the applications are twill apps and are somewhat long-running, but not perpetual: a few hours to a day. Many of the apps (say about half) have a lot of idle time. These apps come from across the enterprise; I don't know why they're idle. There are also a few MR, TEZ, and Spark apps in the mix. If

Re: unsubscribe

2017-08-29 Thread Ravi Prakash
Hi Corne! Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on https://hadoop.apache.org/mailing_lists.html Thanks On Sun, Aug 27, 2017 at 10:25 PM, Corne Van Rensburg wrote: > [image: Softsure] > > unsubscribe > > > > *Corne Van RensburgManaging

Re:

2017-08-29 Thread Ravi Prakash
Hi Dominique, Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on https://hadoop.apache.org/mailing_lists.html Thanks Ravi 2017-08-26 10:49 GMT-07:00 Dominique Rozenberg : > unsubscribe > > > > > > [image: cid:image001.jpg@01D10A65.E830C520] > >

Re: Recommendation for Resourcemanager GC configuration

2017-08-29 Thread Ravuri, Venkata Puneet
rakash <ravihad...@gmail.com> Cc: "Ravuri, Venkata Puneet" <vrav...@ea.com>, "common-u...@hadoop.apache.org" <user@hadoop.apache.org> Subject: Re: Recommendation for Resourcemanager GC configuration Hi Puneet, Along with the heap dump details, I would also like t

Re: Recommendation for Resourcemanager GC configuration

2017-08-29 Thread Ravuri, Venkata Puneet
.com> Cc: "common-u...@hadoop.apache.org" <user@hadoop.apache.org> Subject: Re: Recommendation for Resourcemanager GC configuration What is the ResourceManager JVM’s heap size? What is the value for the configuration yarn.resourcemanager.max-completed-applications? +Vinod On Aug 23, 2017, at

Re: File copy from local to hdfs error

2017-08-29 Thread Atul Rajan
Hello István, Thanks for the help, it finally worked. There was a firewall issue; solving that made HDFS work and take entries from the local file system. Thanks and Regards Atul Rajan -Sent from my iPhone On 28-Aug-2017, at 11:20 PM, István Fajth wrote: Hi Atul, as

Re: YARN - How is a node for a container determined?

2017-08-29 Thread Philippe Kernévez
" densely pack containers on fewer nodes" : quite surprising, +1 with Daemon You have Yarn labels that can be used for that. Classical example are the need of specific hardware fir some processing. https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html Regards,

Re: File copy from local to hdfs error

2017-08-28 Thread István Fajth
Hi Atul, as suggested before, set the BlockManager log level to DEBUG, and check the logs for reasons. You can either set the whole NameNode log to DEBUG level, and look for the messages logged by the BlockManager. Around the INFO level message in the NameNode log similar to the message you see now in
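
The log level can also be flipped at runtime without a restart; a sketch (the NameNode host/port is hypothetical, 50070 being the 2.x default web port):

    hadoop daemonlog -setlevel nn-host:50070 \
      org.apache.hadoop.hdfs.server.blockmanagement.BlockManager DEBUG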

Re: File copy from local to hdfs error

2017-08-28 Thread Atul Rajan
The DataNodes were having issues earlier; I added the required ports in iptables, and after that the data logs are running, but HDFS is not able to distribute the file and make blocks, and any file copied to the cluster is throwing this error. On 28 August 2017 at 21:46, István Fajth wrote:

Re: File copy from local to hdfs error

2017-08-28 Thread István Fajth
Hi Atul, you can check NameNode logs if the DataNodes were in service or there were issues with them. As well you can check for BlockManager's debug level logs for more exact reasons if you can reproduce the issue at will. Istvan On Aug 28, 2017 17:56, "Atul Rajan"

Re: UNSUBSCRIBE

2017-08-28 Thread Shan Huasong
Please UNSUBSCRIBE too! On Mon, Aug 28, 2017 at 1:25 AM, Corne Van Rensburg wrote: > [image: Softsure] > > UNSUBSCRIBE > > > > *Corne Van RensburgManaging Director Softsure* > [image: Tel] 044 805 3746 > [image: Fax] > [image: Email] co...@softsure.co.za > *Softsure (Pty)

RE: Namenode not able to come out of SAFEMODE

2017-08-28 Thread omprakash
om>; user@hadoop.apache.org Subject: Re: Namenode not able to come out of SAFEMODE Hi Om, Although you solved this issue by bumping up the ipc max length which is by default set to 64MB. $ hdfs getconf -confkey ipc.maximum.data.length 67108864 So, it means the disk you are using is having
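
For reference, the property in question as a core-site.xml sketch on the NameNode side (the 128 MB value is an illustrative bump over the 64 MB default reported above):

    <property>
      <name>ipc.maximum.data.length</name>
      <value>134217728</value> <!-- 128 MB -->
    </property>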

Re: Namenode not able to come out of SAFEMODE

2017-08-25 Thread Gurmukh Singh
.in>; user@hadoop.apache.org *Subject:* RE: Namenode not able to come out of SAFEMODE Hi Omprakash, The reported blocks 0 needs additional 6132675 blocks to reach the threshold 0.9990 of total blocks 6138814. The number of *live datanodes 0* has reached the minimum number 0. --->

Re: Data streamer java exception

2017-08-24 Thread surendra lilhore
Hi, I suggest you use shell commands for accessing cluster info instead of curl commands. For hdfs shell commands you can refer to https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html For yarn shell commands you can refer to
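
A sketch of common shell equivalents for cluster info:

    hdfs dfsadmin -report     # datanode and capacity summary
    hdfs fsck /               # filesystem health
    yarn node -list           # NodeManager status
    yarn application -list    # running applications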

Re: Data streamer java exception

2017-08-24 Thread Atul Rajan
Hello Team, I came to a resolution of this issue by allowing the iptables entries for the specific ports used for the namenode as well as the datanode. Now HDFS is running and the cluster is running. Thanks a lot for the suggestion. Now I have another issue with the interface, as I am running the console view of RHEL

RE: Data streamer java exception

2017-08-24 Thread surendra lilhore
Hi Atul, Can you please share the datanode exception logs? Check whether the namenode and datanode hostname mapping is proper in /etc/hosts. The put operation is failing because the datanodes are not connected to the namenode. -Surendra From: Atul Rajan [mailto:atul.raja...@gmail.com] Sent: 24

Re: Recommendation for Resourcemanager GC configuration

2017-08-23 Thread Vinod Kumar Vavilapalli
What is the ResourceManager JVM’s heap size? What is the value for the configuration yarn.resourcemanager.max-completed-applications? +Vinod > On Aug 23, 2017, at 9:23 AM, Ravuri, Venkata Puneet wrote: > > Hello, > > I wanted to know if there is any recommendation for

Re: Recommendation for Resourcemanager GC configuration

2017-08-23 Thread Naganarasimha Garla
Hi Puneet, Along with the heap dump details, I would also like to know the version of Hadoop-Yarn being used, the size of the cluster, all memory configurations, and the JRE version. Also, if possible, can you share the rationale behind the choice of the Parallel GC collector over others (CMS or G1)?

Re: Recommendation for Resourcemanager GC configuration

2017-08-23 Thread Ravi Prakash
Hi Puneet Can you take a heap dump and see where most of the churn is? Is it lots of small applications / few really large applications with small containers etc. ? Cheers Ravi On Wed, Aug 23, 2017 at 9:23 AM, Ravuri, Venkata Puneet wrote: > Hello, > > > > I wanted to know if

Re: Some Configs in hdfs-default.xml

2017-08-23 Thread Ravi Prakash
-interval : This is more a function of how busy your datanodes are (sometimes they are too busy to heartbeat) and how robust your network is (dropping heartbeat packets). It doesn't really take too long to *check* the last heartbeat time of datanodes, but it's a lot of work to order re-replications, so I

Re: JVM OPTS about HDFS

2017-08-18 Thread Akash Mishra
I am currently supporting a single name service in HA, based on QJM, with 0.9 PB of data and 55-58 million objects [files + blocks], with 36G of JVM heap with G1GC. I would recommend starting with 16G and scaling depending on your blocks, with G1GC garbage collection. Thanks, On Fri, Aug 18, 2017 at
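
A sketch of where such a heap/GC choice is typically applied, in hadoop-env.sh (the 16G figure follows the advice above and must be tuned to your file/block counts):

    export HADOOP_NAMENODE_OPTS="-Xms16g -Xmx16g -XX:+UseG1GC ${HADOOP_NAMENODE_OPTS}"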

Re: JVM OPTS about HDFS

2017-08-18 Thread Gurmukh Singh
400GB as heap space for the Namenode is a bit high; the GC pause time will be very high. For a cluster with about 6PB, approx 20GB is decent memory. As you mentioned it is HA, so it is safe to assume that the fsimage is checkpointed at regular intervals and we do not need to worry during a manual

Re: Restoring Data to HDFS with distcp from standard input /dev/stdin

2017-08-16 Thread Ravi Prakash
Hi Heitor! Welcome to the Hadoop community. Think of the "hadoop distcp" command as a script which launches other Java programs on the Hadoop worker nodes. The script collects the list of sources, divides it among the several worker nodes and waits for the worker nodes to actually do the copying

Re: Forcing a file to update its length

2017-08-09 Thread Harsh J
eer* > > [image: cid:image004.png@01D19182.F24CA3E0] > > > > *From:* Harsh J [mailto:ha...@cloudera.com] > *Sent:* Wednesday, August 9, 2017 3:01 PM > *To:* David Robison <david.robi...@psgglobal.net>; user@hadoop.apache.org > *Subject:* Re: Forcing a file to

RE: Forcing a file to update its length

2017-08-09 Thread David Robison
obi...@psgglobal.net>; user@hadoop.apache.org Subject: Re: Forcing a file to update its length I don't think it'd be safe for a reader to force an update of length at the replica locations directly. Only the writer would be perfectly aware of the DNs in use for the replicas and their

Re: Forcing a file to update its length

2017-08-09 Thread Harsh J
I don't think it'd be safe for a reader to force an update of length at the replica locations directly. Only the writer would be perfectly aware of the DNs in use for the replicas and their states, and the precise count of bytes entirely flushed out of the local buffer. Thereby only the writer is

Re: Forcing a file to update its length

2017-08-09 Thread Ravi Prakash
Hi David! A FileSystem class is an abstraction for the file system. It doesn't make sense to do an hsync on a file system (should the file system sync all files currently open / just the user's, etc.?). With appropriate flags maybe you can make it make sense, but we don't have that functionality.

Re: modify the MapTask.java but no change

2017-08-07 Thread Ravi Prakash
, duanyu teng <dyteng.x...@gmail.com> wrote: > Hi, > > I modify the MapTask.java file in order to output more log information. I > re-compile the file and deploy the jar to the whole clusters, but I found > that the output log has not changed, I don't know why. >

Re: modify the MapTask.java but no change

2017-08-07 Thread Edwina Lu
.apache.org" <user@hadoop.apache.org> Subject: modify the MapTask.java but no change Hi, I modify the MapTask.java file in order to output more log information. I re-compile the file and deploy the jar to the whole clusters, but I found that the output log has not changed, I don't know why.

Re: Hadoop 2.8.0: Use of container-executor.cfg to restrict access to MapReduce jobs

2017-08-07 Thread Varun Vasudev
Hi Kevin, The check that’s carried out is the following (pseudo-code) - If (user_id < min_user_id && user_not_in_allowed_system_users) { return "user banned"; } If (user_in_banned_users_list) { return "user banned"; } In your case, you can either bump up the min user id to a higher number
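
A container-executor.cfg sketch matching that description (all values illustrative):

    min.user.id=1000
    allowed.system.users=nobody,hive
    banned.users=root,baduser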

Re: Kerberised JobHistory Server not starting: User jhs trying to create the /mr-history/done directory

2017-08-06 Thread Kevin Buckley
On 25 July 2017 at 03:21, Erik Krogen wrote: > Hey Kevin, > > Sorry, I missed your point about using auth_to_local. You're right that you > should be able to use that for what you're trying to achieve. I think it's > just that your rule is wrong; I believe it should be: >
