Hi Tarik!
The lease is owned by a client. If you launch 2 client programs, they will
be viewed as separate (even though the user is the same). Are you sure you
closed the file when you first wrote it? Did the client program that wrote
the file exit cleanly? In any case, after the namenode lease hard
I looked at this a bit more and I see a container_tokens file in the Spark
directory. Does this contain the credentials that are added by
addCredentials? Is this file accessible to the Spark executors?
It looks like just a clear text protobuf file.
https://github.com/apache/hadoop/blob/82cb2a649
And one of the good things about open-source projects like Hadoop is that you
can read all about why :-) : https://issues.apache.org/jira/browse/HADOOP-4952
Enjoy!
Ravi
On Mon, Oct 30, 2017 at 11:54 AM, Ravi Prakash wrote:
> Hi Doris!
>
> FileContext was created to overcome some of the limitations tha
Hi Doris!
FileContext was created to overcome some of the limitations that we learned
FileSystem had after a lot of experience. Unfortunately, a lot of code (I'm
guessing maybe even the majority) still uses FileSystem.
I suspect FileContext is probably the interface you want to use.
HTH,
Ravi
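For anyone curious what that looks like in practice, here is a minimal
illustrative sketch using FileContext (the class name and path are made up;
FileContext.getFileContext() binds to whatever fs.defaultFS points at):

import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

public class FileContextDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Resolves the default file system configured in fs.defaultFS
    FileContext fc = FileContext.getFileContext(conf);
    Path file = new Path("/tmp/filecontext-demo.txt");
    // Unlike FileSystem.create(), the create flags are passed explicitly
    try (FSDataOutputStream out =
             fc.create(file, EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE))) {
      out.writeUTF("hello from FileContext");
    }
    System.out.println("exists: " + fc.util().exists(file));
  }
}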
O
Hi Margus,
The commit (code version) you are using for building Ozone is very old (Tue Nov
22 17:41:13 2016); can you do a "git pull" on the HDFS-7240 branch and make a new
build?
The documentation you are referring to is also a very old one; currently the
documentation work is happening as part of HDF
If it is for debugging purposes, I would advise trying custom MR counters!
Though you will not get them in the console, you can get them from the web UI
for the running job too.
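For example, a minimal illustrative mapper that bumps custom counters for
debugging (the enum, class name and counter names below are made up):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class DebugMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  // Hypothetical counter group; it shows up under this enum's class name
  // in the job counters on the web UI.
  public enum Debug { RECORDS_SEEN, EMPTY_LINES }

  private static final IntWritable ONE = new IntWritable(1);

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    context.getCounter(Debug.RECORDS_SEEN).increment(1);
    if (value.toString().trim().isEmpty()) {
      context.getCounter(Debug.EMPTY_LINES).increment(1);
      return;
    }
    context.write(new Text(value.toString().trim()), ONE);
  }
}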
On Sun, Oct 8, 2017 at 9:24 PM, Harsh J wrote:
> Consider running your job in the local mode (set config '
> mapreduce.framework.name'
Consider running your job in local mode (set the config
'mapreduce.framework.name' to 'local'). Otherwise, rely on the log viewer
from the (Job) History Server to check the console prints in each task
(under the stdout or stderr sections).
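If you prefer to flip that switch from the driver code, a minimal illustrative
sketch follows (only the 'mapreduce.framework.name' setting is the point; the
class and job names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LocalDebugDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Run the whole job in-process, so task System.out/System.err prints
    // appear directly on this JVM's console instead of in container logs.
    conf.set("mapreduce.framework.name", "local");
    Job job = Job.getInstance(conf, "debug-run"); // placeholder job name
    // ... configure mapper/reducer/input/output as usual, then:
    // job.waitForCompletion(true);
  }
}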
On Thu, 5 Oct 2017 at 05:15 Tanvir Rahman wrote:
> Hell
Hello Demian,
Thanks for the answer.
1. I am using Java for writing the MapReduce application. Can you tell
me how to do it in Java?
2. In the mapper or reducer function, which command did you use to write
the output? Is it going to write it in the log folder? I have multiple nodes
and
I did the same tutorial; I think the only way is doing it outside Hadoop, in
the command line: cat folder/* | python mapper.py | sort | python reducer
El Miércoles, 4 de octubre, 2017 16:20:31, Tanvir Rahman
escribió:
Hello, I have a small cluster and I am running MapReduce WordCount a
Hi,
The easiest way is to open a new window and display the log file as follows:
tail -f /path/to/log/file.log
Best,
Sultan
> On Oct 4, 2017, at 5:20 PM, Tanvir Rahman wrote:
>
> Hello,
> I have a small cluster and I am running MapReduce WordCount application in
> it.
> I want to print some va
Your observation is correct; the backup node will also download.
If you look at the journey/evolution of Hadoop, we had the primary, backup
only, checkpointing node and then a generic secondary node.
The checkpointing node will do the merge of the fsimage and edits.
On 25/9/17 5:57 pm, Chang.Wu wrote:
From the
Hi
Can you explain the job to me a bit? There are a few RPC timeouts, like at the
datanode level, mapper timeouts, etc.
On 28/9/17 1:47 pm, Demon King wrote:
Hi,
We have finished a YARN application and deployed it to a Hadoop 2.6.0
cluster. But if one machine in the cluster is down, our application will
h
Well, in an actual job the input will be a file.
So, instead of:
echo "bla ble bli bla" | python mapper.py | sort -k1,1 | python reducer.py
you will have:
cat file.txt | python mapper.py | sort -k1,1 | python reducer.py
The file has to be on HDFS (keeping it simple; it can be on other
filesystems), t
Rakesh, What sort of communication are you looking for between the
clusters?
I mean, is it:
* At the data node level?
* VPC inter-communication between 2 clusters ?
* Data replication via custom tools?
More details might better help in understanding what you're trying to
accomplish.
-Madhav
On
If you would like to do it in a more dynamic way you can also use
service registries/key-value stores.
For example, the configuration could be stored in Consul and the servers
(namenode, datanode) could be started with consul-template
(https://github.com/hashicorp/consul-template)
In case of
Hi,
For this number of nodes, I'd go with automation tools like
Ansible[1]/Puppet[2]/Rex[3]. They can install the necessary packages, set up
/etc/hosts and make per-node settings.
Ansible has a nice playbook
(https://github.com/analytically/hadoop-ansible) you can start with, and
Puppet isn't short ei
Thanks a lot for your answer. This makes it clear to me now, and I
expected that Hadoop works this way.
===
Ralph
On 20.09.2017 07:57, Harsh J wrote:
Yes, checksum match is checked for every form of read (unless
explicitly disabled). By default, a checksum is generated and stored
for every
Yes, checksum match is checked for every form of read (unless explicitly
disabled). By default, a checksum is generated and stored for every 512
bytes of data (io.bytes.per.checksum), so only the relevant parts are
checked vs. the whole file when doing a partial read.
On Mon, 18 Sep 2017 at 19:23
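To make that concrete, here is an illustrative partial read; the checksum
verification happens transparently inside the client, and only for the
chunks that overlap the requested range (path and offsets are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PartialRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    byte[] buf = new byte[4096];
    try (FSDataInputStream in = fs.open(new Path("/data/big-file"))) {
      // Seek into the middle of the file; the client verifies checksums
      // only for the checksum chunks covering this 4 KB range.
      in.seek(1L << 20);
      in.readFully(buf);
    }
  }
}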
Hi Kevin!
The ApplicationMaster doesn't really need any more configuration I think.
Here's something to try out. Launch a very long MapReduce job:
# A sleep job with 1 mapper and 1 reducer. (All the mapper and reducer do
is sleep for the duration specified in -mt and -rt)
yarn jar
$HADOOP_HOME/s
On 9 September 2017 at 05:17, Ravi Prakash wrote:
> I'm not sure my reply will be entirely helpful, but here goes.
It sheds more light on things than I previously understood, Ravi, so cheers
> The ResourceManager either proxies your request to the ApplicationMaster (if
> the application is runn
Hi Jason,
All data fetched from the ResourceManager, such as the list of apps or reports,
is taken at the current time (not cached). Do you expect some other data?
- Sunil
On Tue, Sep 12, 2017 at 8:09 PM Xu,Jason wrote:
> Hi all,
>
>
>
> I am trying to get information about the cluster via Resource M
Hi Sidharth!
The question seems relevant to the Ambari list :
https://ambari.apache.org/mail-lists.html
Cheers
Ravi
On Fri, Sep 8, 2017 at 1:15 AM, sidharth kumar
wrote:
> Hi,
>
> Apache ambari is open source. So,can we setup Apache ambari to manage
> existing Apache Hadoop cluster ?
>
> Warm
Hi Kevin!
I'm not sure my reply will be entirely helpful, but here goes.
The ResourceManager either proxies your request to the ApplicationMaster
(if the application is running), or (once the application is finished)
serves it itself if the job is in the "cache" (usually the last 1
applicatio
Hi Kellen!
The first part of the configuration is a good indication of which service
you need to restart. Unfortunately the only way to be completely sure is to
read the codez, e.g. most HDFS configuration is mapped to variables in
DFSConfigKeys
$ find . -name *.java | grep -v test | xargs grep
"
Restarting datanode(s) only is OK in this case.
Thanks,
> On Sep 7, 2017, at 10:46 AM, Kellen Arb wrote:
>
> Hello,
>
> I have a seemingly simple question, to which I can't find a clear answer.
>
> Which services/node-types must be restarted for each of the configuration
> properties? For exa
Hi,
The message "User xxx not found" feels more like group mapping error. Do
you have the relevant logs?
Integrating AD with Hadoop can be non-trivial, and Cloudera's general
recommendation is to use third party authentication integrator like SSSD or
Centrify, instead of using LdapGroupsMapping.
Yes, it works. However, this doesn't work with Microsoft SQL Server.
Sent from my iPhone
> On 7 Sep 2017, at 10:09, dna_29a wrote:
>
> Hi,
> I want to run sqoop jobs under kerberos authentication. If I have a ticket
> for local Kerberos user (local KDC and user exists as linux user on each
> hos
Hi,
Immutability is about rewriting a file (random access); that is massively
used by databases, for example.
On HDFS you can only append new data to a file.
HDFS has permissions like a POSIX file system, so you can remove the 'w'
permission on the file if you want to prevent deletion/overwrite.
You
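For instance, a small illustrative sketch of clearing the write bits on a file
via the Java API (the path is made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class MakeReadOnly {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // 0444 = r--r--r-- : removes the 'w' bit for owner, group and other
    fs.setPermission(new Path("/archive/report.pdf"),
        new FsPermission((short) 0444));
  }
}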
Looks like HBase MOB should be mentioned, since the feature was definitely
introduced with photo files/objects in mind.
Regards,
Kai
From: Grant Overby [mailto:grant.ove...@gmail.com]
Sent: Thursday, September 07, 2017 3:05 AM
To: Ralph Soika
Cc: user@hadoop.apache.org
Subject: Re: Is Hadoop
I'm late to the party, and this isn't a hadoop solution, but apparently
Cassandra is pretty good at this.
https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593
On Wed, Sep 6, 2017 at 2:48 PM, Ralph Soika wrote:
> Hi
>
> I want to thank you
Hi
I want to thank you all for your answers and your good ideas on how to
solve the Hadoop "small file problem".
Now I would like to briefly summarize your answers and suggested
solutions. First of all I describe once again my general use case:
* An external enterprise application needs to sto
I think MapR-FS is your solution.
From: Anu Engineer [mailto:aengin...@hortonworks.com]
Sent: Tuesday, September 05, 2017 10:33 PM
To: Hayati Gonultas; Alexey Eremihin; Uwe Geercken
Cc: Ralph Soika; user@hadoop.apache.org
Subject: Re: Is Hadoop basically not suitable for a photo archive?
Please
mixs.com>>,
"user@hadoop.apache.org"
Subject: Re: Re: Is Hadoop basically not suitable for a photo archive?
I would recommend an object store such as openstack swift as another option.
On Mon, Sep 4, 2017 at 1:09 PM Uw
sday, September 05, 2017 6:06 AM
> *To:* Alexey Eremihin ; Uwe Geercken <
> uwe.geerc...@web.de>
> *Cc:* Ralph Soika ; user@hadoop.apache.org
> *Subject:* Re: Re: Is Hadoop basically not suitable for a photo archive?
>
>
>
> I would recommend an object store such as opens
@hadoop.apache.org
Subject: Re: Re: Is Hadoop basically not suitable for a photo archive?
I would recommend an object store such as openstack swift as another option.
On Mon, Sep 4, 2017 at 1:09 PM Uwe Geercken
<uwe.geerc...@web.de> wrote:
just my two cents:
Maybe you can use hadoop for s
y to go for.
>
> Cheers,
>
> Uwe
>
> *Sent:* Monday, 4 September 2017 at 21:32
> *From:* "Alexey Eremihin"
> *To:* "Ralph Soika"
> *Cc:* "user@hadoop.apache.org"
> *Subject:* Re: Is Hadoop basically not suitable for a photo archi
time span.
Yes, it would be a duplication, but maybe - without knowing all the details - that would be acceptable and an easy way to go for.
Cheers,
Uwe
Sent: Monday, 4 September 2017 at 21:32
From: "Alexey Eremihin"
To: "Ralph Soika"
Cc: "user@hadoo
Hi Ralph,
In general Hadoop is able to store such data. And even HAR archives can be
used in conjunction with WebHDFS (by passing offset and limit
attributes). What are your reading requirements? FS metadata is not
distributed, and reading the data is limited by the HDFS NameNode server
performa
Sorry this was meant for hbase. Copy/paste error. Will post there.
On Sat, Sep 2, 2017 at 10:10 AM, Rob Verkuylen wrote:
> On CDH5.12 with HBase 1.2, I'm experiencing an issue I thought was long
> solved. The regions are all assigned to a single regionserver on a restart
> of hbase though cloude
Hello Akira,
Yes, thanks for the solution. I checked; some classpath entries were missing from
the yarn and mapred site files. Adding those resolved the issue and MapReduce ran
smoothly.
Thanks for the article.
Thanks and Regards
Atul Rajan
-Sent from my iPhone
On 02-Sep-2017, at 1:17 AM, Akira Aj
Hi Nishant,
Multicast is used to communicate between Ganglia daemons by default, and it is
blocked in AWS EC2.
Would you try a unicast setting?
Regards,
Akira
On 2017/08/04 12:37, Nishant Verma wrote:
Hello
We are supposed to collect hadoop metrics and see the cluster health and
performance. I
Hi Nishant,
The debug message shows there are not enough racks configured to satisfy the
rack awareness.
http://hadoop.apache.org/docs/r3.0.0-alpha4/hadoop-project-dist/hadoop-common/RackAwareness.html
If you don't need to place replicas in different racks, you can simply ignore
the debug mess
Hi sidharth,
Would you ask Spark-related questions on the user mailing list of Apache Spark?
https://spark.apache.org/community.html
Regards,
Akira
On 2017/08/28 11:49, sidharth kumar wrote:
Hi,
I have configured Apache Spark over YARN. I am able to run a MapReduce job
successfully but spark-sh
Hi Atul,
Have you added HADOOP_MAPRED_HOME to yarn.nodemanager.env-whitelist in
yarn-site.xml?
The document may help:
http://hadoop.apache.org/docs/r3.0.0-alpha4/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_a_Single_Node
Regards,
Akira
On 2017/08/29 17:45, Atul Rajan wrote:
H
Most of the applications are Twill apps and are somewhat long running, but
not perpetual, a few hours to a day. Many of the apps (say about half) have
a lot of idle time. These apps come from across the enterprise; I don't know
why they're idle. There are also a few MR, Tez, and Spark apps in the mix.
If
Hi Corne!
Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on
https://hadoop.apache.org/mailing_lists.html
Thanks
On Sun, Aug 27, 2017 at 10:25 PM, Corne Van Rensburg
wrote:
> [image: Softsure]
>
> unsubscribe
>
>
>
> *Corne Van Rensburg, Managing Director, Softsure*
> [ima
Hi Dominique,
Please send an email to user-unsubscr...@hadoop.apache.org as mentioned on
https://hadoop.apache.org/mailing_lists.html
Thanks
Ravi
2017-08-26 10:49 GMT-07:00 Dominique Rozenberg :
> unsubscribe
>
>
>
>
>
> [image: cid:image001.jpg@01D10A65.E830C520]
>
> *Dominique Rozenberg*, Manager of
uneet" , "common-u...@hadoop.apache.org"
Subject: Re: Recommendation for Resourcemanager GC configuration
Hi Puneet,
Along with the heap dump details, I would also like to know the version of the
Hadoop-Yarn being used, size of the cluster, all Memory configurations, and JRE
version.
Hi Vinod,
The heap size is 40GB and NewRatio is set to 3. We have max completed
applications set to 10.
Regards,
Puneet
From: Vinod Kumar Vavilapalli
Date: Wednesday, August 23, 2017 at 5:47 PM
To: "Ravuri, Venkata Puneet"
Cc: "common-u...@hadoop.apache.org"
Subject
Hello István,
Thanks for the help, it worked finally.
There was a firewall issue; solving that made HDFS work and take entries
from the local file system.
Thanks and Regards
Atul Rajan
-Sent from my iPhone
On 28-Aug-2017, at 11:20 PM, István Fajth wrote:
Hi Atul,
as suggested before, set th
" densely pack containers on fewer nodes" : quite surprising, +1 with Daemon
You have Yarn labels that can be used for that.
Classical example are the need of specific hardware fir some processing.
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
Regards,
Philippe
Perhaps you can go into a bit more detail? Especially for e.g. a map job
(or reduce in mapR), this seems like a major antipattern.
*Daemeon C.M. Reiydelle | San Francisco 1.415.501.0198 | London 44 020 8144 9872*
On Mon, Aug 28, 2017 at 3:37 PM, Grant Overby
wrote:
> When YARN receives a request
Hi Atul,
as suggested before, set the blockmanager log level to debug, and check
logs for reasons. You can either set the whole NameNode log to DEBUG level,
and look for the messages logged by the BlockManager. Around the INFO level
message in the NameNode log similar to the message you see now in
The DataNodes were having issues earlier; I added the required ports in
iptables, and after that the DataNodes are running, but HDFS is not able to
distribute the file and make blocks, and any file copied to the cluster is
throwing this error.
On 28 August 2017 at 21:46, István Fajth wrote:
> Hi Atul,
>
> you
Hi Atul,
You can check the NameNode logs to see if the DataNodes were in service or there
were issues with them. You can also check the BlockManager's debug-level logs
for more exact reasons if you can reproduce the issue at will.
Istvan
On Aug 28, 2017 17:56, "Atul Rajan" wrote:
Hello Sir,
when I am
Please UNSUBSCRIBE too!
On Mon, Aug 28, 2017 at 1:25 AM, Corne Van Rensburg
wrote:
> [image: Softsure]
>
> UNSUBSCRIBE
>
>
>
> *Corne Van Rensburg, Managing Director, Softsure*
> [image: Tel] 044 805 3746
> [image: Fax]
> [image: Email] co...@softsure.co.za
> *Softsure (Pty) Ltd | Registration No.
reducing the total block count as a workaround to the problem.
Regards
Om Prakash
From: Gurmukh Singh [mailto:gurmukh.dhil...@yahoo.com]
Sent: 25 August 2017 17:22
To: omprakash ; brahmareddy.batt...@huawei.com
Cc: 'surendra lilhore' ; user@hadoop.apache.org
Subject: Re: Namenod
*Subject:* RE: Namenode not able to come out of SAFEMODE
Hi Omprakash,
The reported blocks 0 needs additional 6132675 blocks to reach the
threshold 0.9990 of total blocks 6138814. The number of *live
datanodes 0* has reached the minimum number 0.
---> Seeing this message, it looks l
Hi,
I suggest you use shell commands for accessing cluster info instead of curl
commands.
For HDFS shell commands you can refer to
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html
For YARN shell commands you can refer to
https://hadoop.apache.org/docs/current/had
Hello Team,
I came to a resolution of this issue by allowing the iptables entries for
the specific ports used for the namenode as well as the datanode. Now HDFS is
running and the cluster is running.
Thanks a lot for the suggestion. Now I have another issue with the interface, as
I am running the console view of RHEL
Hi Atul,
Can you please share the datanode exception logs? Check whether the namenode and
datanode hostname mapping is proper in /etc/hosts.
The put operation is failing because the datanodes are not connected to the namenode.
-Surendra
From: Atul Rajan [mailto:atul.raja...@gmail.com]
Sent: 24 Augus
What is the ResourceManager JVM’s heap size? What is the value for the
configuration yarn.resourcemanager.max-completed-applications?
+Vinod
> On Aug 23, 2017, at 9:23 AM, Ravuri, Venkata Puneet wrote:
>
> Hello,
>
> I wanted to know if there is any recommendation for ResourceManager GC
> s
Hi Puneet,
Along with the heap dump details, I would also like to know the version of
Hadoop-YARN being used, the size of the cluster, all memory configurations,
and the JRE version.
Also, if possible, can you share the rationale behind the choice of the Parallel
GC collector over others (CMS or G1)?
Rega
Hi Puneet
Can you take a heap dump and see where most of the churn is? Is it lots of
small applications / a few really large applications with small containers,
etc.?
Cheers
Ravi
On Wed, Aug 23, 2017 at 9:23 AM, Ravuri, Venkata Puneet
wrote:
> Hello,
>
>
>
> I wanted to know if there is any reco
check-interval : This is more a function of how
busy your datanodes are (sometimes they are too busy to heartbeat) and how
robust your network is (dropping heartbeat packets). It doesn't really take
too long to *check* the last heartbeat time of datanodes, but it's a lot of
work to order re-replicat
I am currently supporting a single name service in HA, based on QJM, with 0.9
PB of data and 55-58 million objects [files + blocks], with 36G of JVM heap
and G1GC.
I would recommend starting with 16G and scaling depending on your block count,
with G1GC garbage collection.
Thanks,
On Fri, Aug 18, 2017 at 4
400GB as heap space for the Namenode is a bit high. The GC pause time will be
very high.
For a cluster with about 6PB, approx. 20GB is decent memory.
As you mentioned it is HA, so it is safe to assume that the fsimage is
checkpointed at regular intervals and we do not need to worry during a
manual
Hi Heitor!
Welcome to the Hadoop community.
Think of the "hadoop distcp" command as a script which launches other JAVA
programs on the Hadoop worker nodes. The script collects the list of
sources, divides it among the several worker nodes and waits for the worker
nodes to actually do the copying
Also, the cluster is on AWS. Security group set to allow all inbound and
outbound traffic...
Any ideas?...
On 08/16/2017 12:37 PM, Michael Chen wrote:
Hi,
I've run into a ZooKeeper connection error during the execution of a
Nutch hadoop job. The tasks stall on connection error to ZooKeeper
[image: cid:image004.png@01D19182.F24CA3E0]
>
>
>
> *From:* Harsh J [mailto:ha...@cloudera.com]
> *Sent:* Wednesday, August 9, 2017 3:01 PM
> *To:* David Robison ; user@hadoop.apache.org
> *Subject:* Re: Forcing a file to update its length
>
>
>
> I don't think it
@hadoop.apache.org
Subject: Re: Forcing a file to update its length
I don't think it'd be safe for a reader to force an update of length at the
replica locations directly. Only the writer would be perfectly aware of the DNs
in use for the replicas and their states, and the precise coun
I don't think it'd be safe for a reader to force an update of length at the
replica locations directly. Only the writer would be perfectly aware of the
DNs in use for the replicas and their states, and the precise count of
bytes entirely flushed out of the local buffer. Thereby only the writer is
i
Hi David!
A FileSystem class is an abstraction for the file system. It doesn't make
sense to do an hsync on a file system (should the file system sync all
files currently open / just the user's, etc.?). With appropriate flags maybe
you can make it make sense, but we don't have that functionality.
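For completeness, the sync-style calls do exist on the writer's output stream
rather than on FileSystem; a minimal illustrative sketch (the path is made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriterSync {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(new Path("/tmp/stream-demo.log"))) {
      out.writeBytes("some record\n");
      out.hflush(); // push buffered bytes to the datanodes (visible to new readers)
      out.hsync();  // additionally ask the datanodes to sync to disk
    }
  }
}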
09 PM, duanyu teng wrote:
> Hi,
>
> I modify the MapTask.java file in order to output more log information. I
> re-compile the file and deploy the jar to the whole clusters, but I found
> that the output log has not changed, I don't know why.
>
modify the MapTask.java but no change
Hi,
I modified the MapTask.java file in order to output more log information. I
re-compiled the file and deployed the jar to the whole cluster, but I found that
the output log has not changed, and I don't know why.
Hi Kevin,
The check that's carried out is the following (pseudo-code):
if (user_id < min_user_id && user_not_in_allowed_system_users) {
    return "user banned";
}
if (user_in_banned_users_list) {
    return "user banned";
}
In your case, you can either bump up the min user id to a higher number and
On 25 July 2017 at 03:21, Erik Krogen wrote:
> Hey Kevin,
>
> Sorry, I missed your point about using auth_to_local. You're right that you
> should be able to use that for what you're trying to achieve. I think it's
> just that your rule is wrong; I believe it should be:
>
> RULE:[2:$1@$0](jh
/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java#L805
to find out how re-replications are ordered. (If you start the Namenode
with environment variable "export HADOOP_NAMENODE_OPTS='-Xdebug
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=1049'
Hi all,
fyi this is the ticket I opened up:
https://issues.apache.org/jira/browse/MAPREDUCE-6923
Thanks in advance!
Robert
On Mon, Jul 31, 2017 at 10:21 PM, Ravi Prakash wrote:
> Hi Robert!
>
> I'm sorry I do not have a Windows box and probably don't understand the
> shuffle process well enoug
Hi Surendra,
Thanks a lot for the help. After adding this jar the error is gone.
Regards
Om Prakash
From: surendra lilhore [mailto:surendra.lilh...@huawei.com]
Sent: 31 July 2017 18:25
To: omprakash ; Brahma Reddy Battula
; 'user'
Subject: RE: No FileSystem for scheme:
Hi Ravi,
thanks a lot for your response and the code example!
I think this will help me a lot to get started. I am glad to see that my
idea is not too exotic.
I will report if I can adapt the solution for my problem.
best regards
Ralph
On 31.07.2017 22:05, Ravi Prakash wrote:
Hi Ralph!
Alth
Hi Robert!
I'm sorry I do not have a Windows box and probably don't understand the
shuffle process well enough. Could you please create a JIRA in the
MapReduce project if you would like this fixed upstream?
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=116&projectKey=MAPREDUCE
Th
Hi Ralph!
Although not totally similar to your use case, DistCp may be the closest
thing to what you want.
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java
. The client builds a file list, and then submits an MR job to copy ov
: 31 July 2017 18:10
To: Brahma Reddy Battula; 'user'
Subject: RE: No FileSystem for scheme: hdfs when using hadoop-2.8.0 jars
Hi,
I am executing the client from Eclipse on my dev machine. The Hadoop cluster
is a remote machine. I have added the required jars (including
hadoop-hdfs-2.8
From: Brahma Reddy Battula [mailto:brahmareddy.batt...@huawei.com]
Sent: 31 July 2017 16:15
To: omprakash ; 'user'
Subject: RE: No FileSystem for scheme: hdfs when using hadoop-2.8.0 jars
Looks like the jar (hadoop-hdfs-2.8.0.jar) is missing from the classpath. Please
check the client
Looks like the jar (hadoop-hdfs-2.8.0.jar) is missing from the classpath. Please
check the client classpath.
Might be there are no permissions, or this jar was missed while copying..?
Reference:
org.apache.hadoop.fs.FileSystem#getFileSystemClass
if (clazz == null) {
throw new UnsupportedFileSystemExcepti
Determine what is meant by "disaster recovery". What are the scenarios,
what data?
Architect to the business need, not the buzzwords.
*“Anyone who isn’t embarrassed by who they were last year probably isn’t
learning enough.” - Alain de Botton*
*Daemeon C.M. Reiydelle | USA (+1) 415.501.0198 | Londo
Hi Rajila,
Sorry for the delayed reply,
You can refer to
http://hadoop.apache.org/docs/r3.0.0-alpha4/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Counters
or more detailed info is available in the book "Hadoop: The Definitive
Guide, 4th Edition" -> Chapter 9 MapRedu
Hi Nishant!
You should be able to look at the datanode and nodemanager log files to
find out why they died after you ran the 76 mappers. It is extremely
unusual (I haven't heard of a verified case in over 4-5 years) for a job to
kill nodemanagers unless your cluster is configured poorly. Which
con
Take a look at the 2.7.3 docs on rolling upgrade for HDFS:
http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
I don't think there's similar existing docs for YARN, but your cluster
description sounds like you're only using HDFS anyways.
On Fri, Jul 28, 2
Hi Jaxon!
MapReduce is just an application (one of many including Tez, Spark, Slider
etc.) that runs on Yarn. Each YARN application decides to log whatever it
wants. For MapReduce,
https://github.com/apache/hadoop/blob/27a1a5fde94d4d7ea0ed172635c146d594413781/hadoop-mapreduce-project/hadoop-mapred
ng blocks
> on DN2.
>
>
>
> Can this be related to properties I added for increasing replication rate?
>
>
>
> Regards
>
> Om Prakash
>
>
>
> *From:* Ravi Prakash [mailto:ravihad...@gmail.com]
> *Sent:* 27 July 2017 01:26
> *To:* omprakash
> *Cc:* user
Nishant,
Sorry about the late reply. You may want to check out
https://ambari.apache.org/mail-lists.html to see if the Ambari user list
can answer your question better.
William Watson
Lead Software Engineer
J.D. Power O2O
http://www.jdpower.com/data-and-analytics/media-and-marketing-solutions-o2o
Yes, all the files passed must pre-exist. In this case, you would need to run
something as follows:
curl -i -X POST
"http://HOST/webhdfs/v1/PATH_TO_YOUR_HDFS_FOLDER/part-01-00-000?user.name=hadoop&op=CONCAT&sources=PATH_TO_YOUR_HDFS_FOLDER/part-02-00-000,PATH_TO_YOUR_HDFS_FOLDER/part-04-
Hi, Wellington
All the source parts are:
-rw-r--r-- hadoop supergroup 2.43 KB 2 32 MB part-01-00-000
-rw-r--r-- hadoop supergroup 21.14 MB 2 32 MB part-02-00-000
-rw-r--r-- hadoop supergroup 22.1 MB 2 32 MB part-04-00-000
-rw-r--r-- hadoop supergroup 22.29 MB 2 32 MB part-05-00-00
[mailto:ravihad...@gmail.com]
Sent: 27 July 2017 01:26
To: omprakash
Cc: user
Subject: Re: Lots of Exception for "cannot assign requested address" in
datanode logs
Hi Omprakash!
DatanodeRegistration happens when the Datanode first heartbeats to the Namenode.
In your case, it
Hi Omprakash!
DatanodeRegistration happens when the Datanode first heartbeats to the
Namenode. In your case, it seems some other application has acquired
port 50010. You can check this with the command "netstat -anp | grep
50010". Are you trying to run 2 datanode processes on the same machine
Thank you Naga & Sunil.
Naga, I would like to know more about the counters; are they a cluster-wide
resource managed at a central location, so they can be tracked/verified
later?
Please advise.
Thanks,
Rajila
On Tue, Jul 25, 2017 at 7:01 PM, Naganarasimha Garla <
naganarasimha...@apache.org>
Hi Rajila,
One option you can think of is using custom "counters" and
having logic to increment them whenever you insert or have any custom
logic. These counters can be retrieved from the MR interfaces and even in the
web UI after the job has finished.
Regards,
+ Naga
On Tue, Jul 25
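To add a concrete, illustrative sketch of reading such counters back from the
submitting client with the standard Job API (the enum name here is hypothetical
and should mirror whatever your tasks increment):

import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;

public class ReadCounters {
  // Hypothetical counter enum, mirroring the one incremented inside your tasks.
  public enum MyCounters { ROWS_INSERTED }

  static long rowsInserted(Job job) throws Exception {
    // Fetched from the AM while running, or from the history server afterwards
    Counters counters = job.getCounters();
    Counter c = counters.findCounter(MyCounters.ROWS_INSERTED);
    return c.getValue();
  }
}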
Hi Cinyoung,
Concat has some restrictions, like the need for the src file's last block size
to be the same as the configured dfs.block.size. If all the conditions are met,
the command example below should work (where we are concatenating /user/root/file-2
into /user/root/file-1):
curl -i -X POST