Hi,
In Hadoop MR unit tests, the classes use
./core/org/apache/hadoop/util/Tool.java and
./core/org/apache/hadoop/util/ToolRunner.java to submit the job. But to
run the unit tests it seems that MR does not need to be running. If
so, who runs the map and reduce tasks?
--
Best regards,
P
Not sure which specific tests you are talking about. There are two types of
them:
- Real unit tests, which unit-test code and shouldn't run any MR jobs
- The remaining 'unit' tests are really integration tests. They start a
MiniMRCluster and a MiniDFSCluster (which are basically in-JVM MR and DFS)
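The first kind of test answers the original question: nothing runs the map and reduce tasks, because the interesting logic is factored into plain methods and tested directly. A minimal sketch of that style, with a hypothetical WordCountLogic helper (not a Hadoop class), needing no cluster at all:

```java
import java.util.Arrays;
import java.util.List;

public class WordCountLogic {
    // The interesting part of a mapper, kept as a pure function so it can
    // be unit-tested without MiniMRCluster or a running cluster.
    public static List<String> tokenize(String line) {
        return Arrays.asList(line.trim().toLowerCase().split("\\s+"));
    }

    public static void main(String[] args) {
        // A plain assertion-style check; no MR job is submitted anywhere.
        List<String> words = tokenize("Hello  Hadoop World");
        if (!words.equals(Arrays.asList("hello", "hadoop", "world"))) {
            throw new AssertionError("unexpected tokens: " + words);
        }
        System.out.println("ok");
    }
}
```

The integration tests are the ones that actually execute tasks, inside the in-JVM MiniMRCluster.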
Hello,
How can I impose a read lock on a file in HDFS,
so that only one user or one application can access the file in HDFS at any
point in time?
Regards
Abhi
HDFS does not have such a client-side feature, but your applications
can use Apache ZooKeeper to coordinate and implement this on their own
- it can be used to achieve distributed locking. While at ZooKeeper,
also check out https://github.com/Netflix/curator, which makes using it
for common needs easier.
Thanks for the reply, Harsh. I will look into ZooKeeper.
Regards
Abhi
On Feb 22, 2013, at 1:03 AM, Harsh J ha...@cloudera.com wrote:
Posting the whole logs from the task now:
--
---
Task Logs: 'attempt_201302202127_0021_m_00_0'
*stdout logs*
--
*stderr logs*
log4j:WARN No appenders could be found for logger (org.apache.hadoop.hdfs.
DFSClient).
Thanks Manoj for your answer. :)
That helped.
From: Agarwal, Nikhil
Sent: Tuesday, February 19, 2013 4:53 PM
To: 'user@hadoop.apache.org'
Subject: Which class or method is called first when i run a command in hadoop
Hi All,
Thanks for your answers till now. I was trying to debug Hadoop
Hi,
I am planning to add a file system called CDMI under org.apache.hadoop.fs in
Hadoop, something similar to KFS or S3, which are already there under
org.apache.hadoop.fs. I wanted to ask: say I write my file system for CDMI
and add the package under fs, but then how do I tell the
Hi Samir
Looks like there is some syntax issue with the SQL query generated internally.
Can you try doing a Sqoop import by specifying the query with the --query option?
Regards
Bejoy KS
Sent from remote device, Please excuse typos
-Original Message-
From: samir das mohapatra
This may be a basic beginner debugging question; I will appreciate it if anyone
can shed some light:
Here is the method i have in Eclipse:
***
@Override
protected void setup(Context context)
    throws java.io.IOException, InterruptedException {
  Path[]
What Hemanth points to (fs.TYPE.impl, i.e. fs.cdmi.impl being set to
the classname of the custom FS impl.) is correct for the 1.x releases.
In 2.x and onward, the class for a URI is auto-discovered from the
classpath (as a 'service'). So as long as your jar is present on the
user's runtime, the FS
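On 1.x, that fs.TYPE.impl registration might look like the following core-site.xml fragment; the scheme here matches the cdmi example above, but the class name is an assumption for illustration, not an existing implementation:

```xml
<!-- Hypothetical registration of a cdmi:// scheme on Hadoop 1.x.
     The value must be the fully qualified name of your FileSystem subclass. -->
<property>
  <name>fs.cdmi.impl</name>
  <value>org.apache.hadoop.fs.cdmi.CDMIFileSystem</value>
</property>
```

With this set, a path like cdmi://host/dir would be dispatched to that class; on 2.x the same mapping is discovered from the jar's service metadata instead.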
Hi Agarwal,
This repository and the corresponding README file may give you some hint
for the configuration.
https://github.com/gluster/hadoop-glusterfs
yours,
Kun Ling
On Thu, Feb 21, 2013 at 9:14 PM, Ling Kun lkun.e...@gmail.com wrote:
Some hints:
1) For features, you could start with the unit tests available with hadoop fs.
2) For performance, compare various benchmark results.
3) I could see at least two reasons for that. It could be that your
filesystem does not support locality, so tasks are not executed on the same
node as the data.
It indicates 'cannot find com.google.protobuf'.
On Feb 21, 2013 7:38 PM, Ted yuzhih...@gmail.com wrote:
What compilation errors did you get ?
Thanks
On Feb 21, 2013, at 1:37 AM, Azuryy Yu azury...@gmail.com wrote:
Hi ,
I just want to share some experience on hadoop-2.x compiling.
It
Hi Sai
The location you are seeing should be mapred.local.dir.
From my understanding the files in distributed cache would be available in
that location while you are running the job and would be cleaned up at the end
of it.
Regards
Bejoy KS
Sent from remote device, Please excuse typos
Hello! Anyone in the NYC area recommend any of the hadoop training
classes for administrators?
Thanks a lot!
Guy
I was able to write a little code to make this happen, and submitted a
patch to Hadoop:
https://issues.apache.org/jira/browse/MAPREDUCE-5018
There is a jar file and shell script there for anybody who wants to try
this without recompiling all of Hadoop. It lets you run something like
mapstream
I am looking for an explanation of how Kerberos works with a Hadoop cluster.
I need to know how the KDC is used by HDFS and MapReduce.
(Something like this: an example of Kerberos with a mail server,
https://www.youtube.com/watch?v=KD2Q-2ToloE)
How are the name node and data node prone to attacks?
What
You should read the hadoop security design doc which you can find at
https://issues.apache.org/jira/browse/HADOOP-4487
HTH,
+Vinod Kumar Vavilapalli
Hortonworks Inc.
http://hortonworks.com/
On Feb 21, 2013, at 11:02 AM, rohit sarewar wrote:
I am looking for an explanation of Kerberos
Hello there, I'm trying to use Hadoop MapReduce to process an open file. The
writing process writes a line to the file and syncs the file to readers
(org.apache.hadoop.fs.FSDataOutputStream.sync()).
If I try to read the file from another process, it works fine, at least
using
Hi Mayur,
Where have you downloaded the DEB files? Are they Debian related? Or
Ubuntu related? Ubuntu is not worse than CentOS. They are just different
choices. Both should work.
JM
Hi Ma
2013/2/21 Harsh J ha...@cloudera.com
Try the debs from the Apache Bigtop project 0.3 release, it's a bit
Mayur,
Have you looked at that?
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
I just created a VM, installed Debian 64 bits, downloaded the .deb file and
installed it without any issue. Are you using Ubuntu 64 bits? Or 32 bits?
JM
2013/2/21 Mayur
Hi Mallika,
Have you tried Neo4j?
-Solomon
On Fri, Feb 22, 2013 at 5:22 AM, SUJIT PAL sujit@comcast.net wrote:
Hi Mallika,
Couldn't this be done from the relational database itself?
To get the group counts:
select count(*) from your_table where your_condition group by
I am using 32 bits. I will look out for your link, JM sir.
On Fri, Feb 22, 2013 at 8:17 AM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
I have considered the DistributedCache and will probably be using it, but
in order to have a file to cache I need to serialize the configuration
object first. :-)
On Thu, Feb 21, 2013 at 5:55 PM, feng lu amuseme...@gmail.com wrote:
Hi
Maybe you can see the usage of DistributedCache [0],
yes, you are right. First upload serialized configuration file to HDFS and
retrieve that file in the Mapper#configure method for each Mapper, and
deserialize the file to configuration object.
It seems that serializing the configuration file is required. You can find
many data serialization
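The round trip described above (serialize the configuration object, ship it as a file, deserialize it back in the Mapper's setup) can be sketched with plain Java serialization. The SimpleConfig class here is a hypothetical stand-in for the real configuration object, and the HDFS upload / DistributedCache transport in the middle is left out:

```java
import java.io.*;

public class ConfigRoundTrip {
    // Hypothetical stand-in for the configuration object being discussed.
    static class SimpleConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        String inputPath;
        int numBuckets;
        SimpleConfig(String inputPath, int numBuckets) {
            this.inputPath = inputPath;
            this.numBuckets = numBuckets;
        }
    }

    // What the job driver would do before uploading the bytes to HDFS.
    static byte[] serialize(SimpleConfig c) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // What each Mapper would do after reading the cached file back.
    static SimpleConfig deserialize(byte[] bytes) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (SimpleConfig) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        SimpleConfig copy = deserialize(serialize(new SimpleConfig("/data/in", 8)));
        if (!copy.inputPath.equals("/data/in") || copy.numBuckets != 8) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("round trip ok");
    }
}
```

In the real job the byte array would be written to an HDFS file, registered with the DistributedCache, and read back in setup()/configure() on each task.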
Hazelcast is an interesting idea, but I was hoping that there is a way of
doing this in MapReduce. :-)
It didn't seem like that from the start, but I posted here just to make
sure I was not missing something.
So, I will serialize my data objects and use them accordingly.
Thanks!
On Thu, Feb
I also saw some reference to being able to run hadoop job -blacklist-host
or some such, but
http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#job doesn't show
it.
Thanks. You are correct that this strategy does not achieve a total
sort, only a partial/local sort, since that's all the application
requires. I think the technique is sometimes referred to as secondary
sort, and KeyFieldBasedPartitioner is sometimes used as a convenience
to implement it, but our
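The partial/local sort described above can be shown in miniature with plain Java: partition records by the natural key only (as KeyFieldBasedPartitioner would), then sort each partition by the full composite key. This is an in-JVM sketch of the idea, not Hadoop's actual shuffle:

```java
import java.util.*;

public class SecondarySortSketch {
    // A record with a natural key plus a secondary field to sort on.
    record Rec(String key, int value) {}

    static Map<Integer, List<Rec>> partitionAndSort(List<Rec> recs, int numPartitions) {
        Map<Integer, List<Rec>> parts = new TreeMap<>();
        for (Rec r : recs) {
            // Partition on the natural key only, so all values for a key
            // land in the same partition (the KeyFieldBasedPartitioner role).
            int p = Math.floorMod(r.key().hashCode(), numPartitions);
            parts.computeIfAbsent(p, k -> new ArrayList<>()).add(r);
        }
        // Sort each partition by (key, value): the secondary sort. Each
        // partition is ordered per key, but there is no total order overall.
        Comparator<Rec> composite =
            Comparator.comparing(Rec::key).thenComparingInt(Rec::value);
        for (List<Rec> part : parts.values()) {
            part.sort(composite);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<Rec> recs = List.of(
            new Rec("a", 3), new Rec("b", 1), new Rec("a", 1), new Rec("b", 2));
        for (Map.Entry<Integer, List<Rec>> e : partitionAndSort(recs, 2).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```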
Dear Harsh J,
Firstly, thanks for your quick and detailed reply. Your suggestion is
very helpful to me!
1. For the Hadoop MapReduce regression test:
1.1 In theory, as long as I have correctly implemented the whole
org.apache.hadoop.fs.FileSystem interface, Hadoop MR should work
correctly.
Dear Julien Muller and Harsh,
Thanks very much for all your hints.
Are there any recommended applications besides WordCount and TeraSort?
Thanks
Ling Kun
On Thu, Feb 21, 2013 at 9:26 PM, Julien Muller julien.mul...@ezako.comwrote: