Hello,
I am running a global sort (on Pigmix input data, size 600GB) based
on TotalOrderPartitioner. The best practice according to the literature
points to data sampling using RandomSampler. The query succeeds but takes a
very long time (7 hours), and that's because there is only one reducer.
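For context, a minimal sketch of the usual TotalOrderPartitioner setup, assuming the 2.x mapreduce API; the input format, paths, sampling rate, and reducer count below are illustrative assumptions, not tuned values:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public class GlobalSort {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "global-sort");
    job.setJarByClass(GlobalSort.class);
    // An input format with Text keys so the sampler and partitioner agree
    job.setInputFormatClass(KeyValueTextInputFormat.class);
    job.setMapOutputKeyClass(Text.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // The key point: with only one reducer there is nothing to partition.
    // TotalOrderPartitioner needs several reducers plus sampled boundaries.
    job.setNumReduceTasks(100);
    job.setPartitionerClass(TotalOrderPartitioner.class);
    InputSampler.Sampler<Text, Text> sampler =
        new InputSampler.RandomSampler<>(0.01, 10000, 10);
    InputSampler.writePartitionFile(job, sampler);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

With a single reducer the partition file is ignored and the whole 600GB flows through one task, which would explain the 7-hour runtime.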
As far as I know, there's no combination of Hadoop APIs that can do that.
You can easily get the location of a block (which DN it's on), but there's no
way to get the local address of that block file.
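To illustrate, the host-level lookup that is available can be sketched as follows (the path argument is hypothetical); note that it returns datanode host names only, never the block's local file path:

```java
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockHosts {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus st = fs.getFileStatus(new Path(args[0]));
    for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
      // getHosts() exposes which datanodes hold the block -- the client
      // API stops here; the on-disk block file path is not exposed
      System.out.println(b.getOffset() + " " + Arrays.toString(b.getHosts()));
    }
  }
}
```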
On Thu, Aug 28, 2014 at 11:54 AM, Demai Ni nid...@gmail.com wrote:
Yehia,
No problem at all. I
Normally an MR job is used for batch processing, so I don't think this is a
good use case for MR.
Since you need to run the program periodically, you cannot submit a single
MapReduce job for this.
A possible way is to create a cron job that scans the folder size and submits
an MR job if necessary.
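A rough sketch of that cron approach; the folder path, jar, job class, and threshold below are all assumptions for illustration:

```shell
#!/bin/sh
# Submit a job when the monitored folder exceeds THRESHOLD bytes.
THRESHOLD=1073741824   # 1 GiB, an illustrative limit

# Succeeds when the given byte count exceeds the threshold.
folder_exceeds() {
    [ "$1" -gt "$THRESHOLD" ]
}

# `hdfs dfs -du -s` prints "<bytes> <path>"; keep the first field.
SIZE=$(hdfs dfs -du -s /data/incoming 2>/dev/null | awk '{print $1}')
if [ -n "$SIZE" ] && folder_exceeds "$SIZE"; then
    hadoop jar my-job.jar com.example.MyJob /data/incoming /data/out
fi
```

Scheduled from cron, e.g. every 15 minutes: `*/15 * * * * /usr/local/bin/check_folder.sh`.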
You should not use this method:
FSDataOutputStream fp = fs.create(pt, true)
Here's the java doc for this create method:
/**
 * Create an FSDataOutputStream at the indicated Path.
 * @param f the file to create
 * @param overwrite if a file with this name already exists, then if
All
I am using libhdfs and need some usage like the following; when the JNI call
returns, it results in a crash in the JVM. The attachment has the detailed
information.
Call chain: Java -> JNI -> C lib -> libhdfs
Crash info
#
# A fatal error has
Normally files in HDFS are intended to be quite big; they don't display
easily in the browser.
On Fri, Aug 22, 2014 at 10:56 PM, Brian C. Huffman
bhuff...@etinternational.com wrote:
All,
I noticed that on Hadoop 2.5.0, when browsing the HDFS filesystem on
port 50070, you can't
Right, please use FileSystem#append
From: Stanley Shi [mailto:s...@pivotal.io]
Sent: Thursday, August 28, 2014 2:18 PM
To: user@hadoop.apache.org
Subject: Re: Appending to HDFS file
You should not use this method:
FSDataOutputStream fp = fs.create(pt, true)
Here's the java doc for this create
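A hedged sketch of the append-instead-of-overwrite pattern being recommended here (the file path is hypothetical, and the cluster must permit appends):

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path pt = new Path("/tmp/mylog.txt");  // hypothetical file
    // create(pt, true) truncates the existing file; append() adds to it
    try (FSDataOutputStream out =
             fs.exists(pt) ? fs.append(pt) : fs.create(pt, false)) {
      out.write("new line\n".getBytes(StandardCharsets.UTF_8));
    }
  }
}
```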
Hi Users,
I want to know the behaviour of a Hadoop cluster for writing/reading in the
following replication cases:
1. replication=2 and in the cluster (3 datanodes + namenode) one datanode
goes down.
2. replication=2 and in the cluster (3 datanodes + namenode) 2 datanodes go
down.
BR,
Satyam
Thanks, it fixed my problem!
From: Arpit Agarwal [mailto:aagar...@hortonworks.com]
Sent: Thursday, August 28, 2014 01:41
To: user@hadoop.apache.org
Subject: Re: Running job issues
Susheel is right. I've fixed the typo on the wiki page.
On Wed, Aug 27, 2014 at 12:28 AM, Susheel Kumar
For #1, since you still have 2 datanodes alive and the replication is 2,
writing will succeed (and reads will succeed).
For #2, you now have only 1 datanode while the replication is 2, so the
initial write will succeed, but at some point pipeline recovery will fail.
Regards,
Yi Liu
Thank you all,
It works now
Regards
rab
On 28 Aug 2014 12:06, Liu, Yi A yi.a@intel.com wrote:
Right, please use FileSystem#append
*From:* Stanley Shi [mailto:s...@pivotal.io]
*Sent:* Thursday, August 28, 2014 2:18 PM
*To:* user@hadoop.apache.org
*Subject:* Re: Appending to HDFS
#0 0x7f1e3872c425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x7f1e3872c425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x7f1e3872fb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x7f1e380a4405 in os::abort(bool) () from
Hello,
we are using Hadoop 2.2.0 (HDP 2.0) and Avro 1.7.4, running on CentOS 6.3.
I am facing the following issue when using AvroMultipleOutputs with dynamic
output files. My M/R job works fine for a smaller amount of data, or at
least the error hasn't appeared there so far. With a bigger amount of data I
Hello,
I can't find any information on how possible or difficult it is to install
Hadoop as a single node on Windows 8 running Oracle Java 8. The tutorial
Hadoop 2 on Windows (http://wiki.apache.org/hadoop/Hadoop2OnWindows) mentions
neither Windows 8 nor Java 8. Is there anything known
Currently Hadoop doesn't officially support Java 8.
Regards,
Yi Liu
From: Ruebenacker, Oliver A [mailto:oliver.ruebenac...@altisource.com]
Sent: Thursday, August 28, 2014 8:46 PM
To: user@hadoop.apache.org
Subject: Hadoop on Windows 8 with Java 8
Hello,
I can't find any information on
Or, maybe have a look at Apache Falcon:
Apache Falcon - Data management and processing platform
unsubscribe
On Thu, Aug 28, 2014 at 6:42 PM, Eric Payne eric.payne1...@yahoo.com
wrote:
Or, maybe have a look at Apache Falcon:
Apache Falcon - Data management and processing platform
http://falcon.incubator.apache.org/
Can anyone please help me with this installation error?
After I type start-yarn.sh :
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-xx.out
localhost: ssh: connect to host localhost port 22: connection refused
when I ran jps to check, only Jps and
try 'ssh localhost' and show the output
On Thu, Aug 28, 2014 at 7:55 PM, Li Chen ahli1...@gmail.com wrote:
Can anyone please help me with this installation error?
After I type start-yarn.sh :
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-xx.out
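"Connection refused" on port 22 usually means sshd isn't running or passwordless login isn't set up. A sketch of the usual checks, assuming a Debian/Ubuntu-style host (adjust the service command for your distro):

```shell
ssh localhost                                # reproduce the failure first
sudo service ssh start                       # start the SSH daemon
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa     # passwordless key, if missing
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                # should now log in without a prompt
```

Once `ssh localhost` works, start-yarn.sh should be able to launch the nodemanager.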
Thank you to everyone who responded to this thread. I got a couple of good
pointers and some good online courses to explore to get a fundamental
understanding of things.
Thanks
Amar
On Thu, Aug 28, 2014 at 10:15 AM, Sriram Balachander
sriram.balachan...@gmail.com wrote:
Hadoop The
Hi
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
More information
After I started the resourcemanager
[root@vm38 ~]# /etc/init.d/hadoop-yarn-resourcemanager start
Starting Hadoop resourcemanager: [ OK ]
and opened the cluster web interface, there are some TCP connections to 8088:
[root@vm38 ~]# netstat -np | grep 8088
tcp
Moved the resourcemanager to another server and it works. I guess I had
some network misrouting there :)
Best regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE (serialNumber=37303140314)
On
Hi All,
I am running an MRV1 job on a Hadoop YARN 2.3.0 cluster. The problem is that
when I submit this job, YARN creates multiple applications for the submitted
job, and the last application running in YARN is marked as complete even
though the console reports it as only 58% complete. I have
Hi,
I use Hadoop 2.4.1 and I got an "org.apache.hadoop.io.compress.SnappyCodec
not found" error:
hadoop checknative
14/08/29 02:54:51 WARN bzip2.Bzip2Factory: Failed to load/initialize
native-bzip2 library system-native, will use pure-Java version
14/08/29 02:54:51 INFO zlib.ZlibFactory: Successfully
Stanley and all,
thanks. I will write a client application to explore this path. A quick
question again.
Using the fsck command, I can retrieve all the necessary info
$ hadoop fsck /tmp/list2.txt -files -blocks -racks
.
BP-13-7914115-10.122.195.197-14909166276345:blk_1073742025 len=8
Hi,
It looks like a classpath problem on the Spark side.
Thanks,
- Tsuyoshi
On Fri, Aug 29, 2014 at 8:49 AM, arthur.hk.c...@gmail.com
arthur.hk.c...@gmail.com wrote:
Hi,
I use Hadoop 2.4.1, I got an "org.apache.hadoop.io.compress.SnappyCodec not
found" error:
hadoop checknative
14/08/29 02:54:51
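When checknative reports codecs missing, a common first step is to make the native library directory visible to the JVM. A sketch, with an assumed install path:

```shell
# Point the JVM at the native Hadoop libraries (install path is an assumption).
export HADOOP_HOME=/usr/local/hadoop
export LD_LIBRARY_PATH="$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH"
# Re-check which codecs load; look for "snappy: true" in the output.
command -v hadoop >/dev/null && hadoop checknative -a || true
```

If Snappy still fails, the libsnappy shared library may simply not be installed on the node.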
Hello, I have a question about using cached data in memory via centralized
cache management. I cached the data that I want to use through the CLI (hdfs
cacheadmin -addDirective ...). Then, when I write my MapReduce application, how
can I read the cached data in memory? Here is the source
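For reference, the CLI side can be sketched as follows (the pool and path names are assumptions). As I understand centralized cache management, the application then reads the path through the normal FileSystem API; cached blocks are served from DataNode memory transparently, with no special read call needed:

```shell
# Create a cache pool and cache a directory into it (names are illustrative).
hdfs cacheadmin -addPool testPool
hdfs cacheadmin -addDirective -path /user/data/input -pool testPool
# Verify the directive and how many bytes are actually cached.
hdfs cacheadmin -listDirectives -stats
```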