Hi Team,
I have a file which has semi-structured text data with no definite start
and end points.
How can I send the entire content of the file at once as the key or value to
the mapper, instead of line by line?
Thanks,
Subbu
Hi Subbu. Sounds like you'll have to implement a custom non-splittable
InputFormat which instantiates a custom RecordReader, which in turn consumes
the entire file when its next(K, V) method is called. Once implemented, you
specify the input format on the JobConf object:
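For reference, a minimal sketch of that approach against the old mapred API
(matching the JobConf mention above). WholeFileInputFormat and
WholeFileRecordReader are illustrative names, the whole file is handed over as
a single BytesWritable value, and error handling is kept minimal:

import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

  @Override
  protected boolean isSplitable(FileSystem fs, Path file) {
    return false; // one split per file, so the reader can consume it whole
  }

  @Override
  public RecordReader<NullWritable, BytesWritable> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new WholeFileRecordReader((FileSplit) split, job);
  }

  // Emits exactly one record: the entire file content as the value.
  static class WholeFileRecordReader implements RecordReader<NullWritable, BytesWritable> {
    private final FileSplit split;
    private final JobConf job;
    private boolean processed = false;

    WholeFileRecordReader(FileSplit split, JobConf job) {
      this.split = split;
      this.job = job;
    }

    public boolean next(NullWritable key, BytesWritable value) throws IOException {
      if (processed) {
        return false;
      }
      byte[] contents = new byte[(int) split.getLength()];
      Path file = split.getPath();
      FileSystem fs = file.getFileSystem(job);
      FSDataInputStream in = fs.open(file);
      try {
        IOUtils.readFully(in, contents, 0, contents.length);
      } finally {
        IOUtils.closeStream(in);
      }
      value.set(contents, 0, contents.length);
      processed = true;
      return true;
    }

    public NullWritable createKey() { return NullWritable.get(); }
    public BytesWritable createValue() { return new BytesWritable(); }
    public long getPos() { return processed ? split.getLength() : 0; }
    public float getProgress() { return processed ? 1.0f : 0.0f; }
    public void close() { }
  }
}

and in the driver:

conf.setInputFormat(WholeFileInputFormat.class);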
Hi,
You could send the file's meta info to the map function as the key/value through
the split, and then read the entire file in your map function.
Thanks
Devaraj k
-----Original Message-----
From: Kasi Subrahmanyam [mailto:kasisubbu...@gmail.com]
Sent: 11 July 2013 13:38
To:
Hi,
It seems mahout-examples-0.7-job.jar depends on other jars/classes.
While running the job tasks, it is not able to find those classes in the
classpath and is failing those tasks.
You need to provide the dependent jar files while submitting/running the job.
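For reference, one common way to ship dependent jars is the -libjars generic
option; this assumes the driver uses ToolRunner/GenericOptionsParser, and the
driver class and jar paths here are placeholders:

hadoop jar mahout-examples-0.7-job.jar org.example.SomeDriver \
  -libjars /path/to/dep1.jar,/path/to/dep2.jar \
  <job args>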
Thanks
Devaraj k
Hello Hadoop Community!
I've set up datanodes on a private network by adding private hostnames to
the slaves file,
but it looks like when I look at the web UI, the datanodes are registered with
public hostnames.
Are they actually communicating over the public network?
All datanodes have eth0 with public
Have you tried playing with this config parameter:
dfs.datanode.dns.interface?
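For reference, a hedged sketch of setting it in hdfs-site.xml on each datanode
(the interface name eth1 is a placeholder for your private interface):

<property>
  <name>dfs.datanode.dns.interface</name>
  <value>eth1</value>
</property>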
On Thu, Jul 11, 2013 at 4:20 AM, Ben Kim benkimkim...@gmail.com wrote:
Make sure that your hostnames resolve (via DNS and/or hosts files) to
private IPs.
If you have records in the nodes' hosts files like
public-IP hostname
remove them (or comment them out).
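For example, a hypothetical /etc/hosts entry to drop versus keep (all
addresses and names are made up):

# /etc/hosts on each node
# 203.0.113.10   datanode1    <- public mapping: remove or comment out
10.0.0.10        datanode1    # keep only the private mapping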
Alex
On Jul 11, 2013 2:21 AM, Ben Kim benkimkim...@gmail.com wrote:
Here are logs of RM and 2 NMs:
RM (master-host): http://pastebin.com/q4qJP8Ld
NM where AM ran (slave-1-host): http://pastebin.com/vSsz7mjG
NM where slave container ran (slave-2-host): http://pastebin.com/NMFi6gRp
The only related error I've found in them is the following (from RM logs):
...
Hi
I have two nodes:
n1 (master, slave) and n2 (slave)
After setup I ran the wordcount example and it worked fine:
[hduser@n1 ~]$ hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar
wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
13/07/11 15:30:44 INFO input.FileInputFormat:
Hi,
I am running HDFS on Amazon EC2.
Say I have an FTP server which stores some data.
I just want to copy this data directly to HDFS in a parallel way (which
may be more efficient).
I think hadoop distcp is what I need.
But
$ bin/hadoop distcp
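For reference, a hedged sketch of what such an invocation might look like;
hostnames, credentials, and paths are placeholders, and the FTP source relies
on Hadoop's FTPFileSystem:

hadoop distcp ftp://user:password@ftp-host/path/to/data \
  hdfs://namenode-host:8020/user/hduser/data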
Hi All,
Can anyone point me to a link or document that explains the below?
How does Cloudera Manager work and handle the clusters (Agent and Master
Server)?
How does the Cloudera Manager process flow work?
Where can I locate the Cloudera configuration files, with a brief explanation?
Regards
Sathish
Sorry for the typo:
mahout, not mahou. Sent from mobile
On Jul 11, 2013 9:40 PM, Azuryy Yu azury...@gmail.com wrote:
hi,
put all mahou jars under hadoop_home/lib, then restart cluster.
On Jul 11, 2013 8:45 PM, Margusja mar...@roo.ee wrote:
Hi,
Go through the links.
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Managing-Clusters/cmmc_CM_architecture.html
Thank you, it resolved the problem.
Funny, I don't remember copying the mahout libs to n1's hadoop, but there
they are.
Regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
skype: margusja
So in my driver code, I try to store the file in the cache with this line of
code:
job.addCacheFile(new URI(file location));
Then in my Mapper code, I do this to try and access the cached file:
URI[] localPaths = context.getCacheFiles();
File f = new File(localPaths[0]);
However, I get a
Sathish, this mailing list is for Apache Hadoop related questions. Please post
questions related to other distributions to the appropriate vendor's mailing
list.
On Thu, Jul 11, 2013 at 6:28 AM, Sathish Kumar sa848...@gmail.com wrote:
Hello,
I am wondering how the memory counters 'PHYSICAL_MEMORY_BYTES' and
'VIRTUAL_MEMORY_BYTES' are calculated. Are they peaks of memory usage or
cumulative usage?
Thanks for the help,
Satish,
the right alias for Cloudera Manager questions is scm-us...@cloudera.org
Thanks
On Thu, Jul 11, 2013 at 9:20 AM, Suresh Srinivas sur...@hortonworks.com wrote:
Yeah Andrew, there seems to be some problem with the context.getCacheFiles()
API, which is returning null.

Path[] cachedFilePaths = context.getLocalCacheFiles(); // I am checking why it is deprecated...
for (Path cachedFilePath : cachedFilePaths) {
    File cachedFile = new
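For reference, a minimal sketch of the new-API pattern as I understand it: in
the new mapreduce API, context.getCacheFiles() returns the original URIs as
added by Job.addCacheFile() rather than localized paths, so one common
approach is to read via the symlink the framework creates in the task's
working directory. The symlink name "cachefile" and the HDFS path below are
made-up examples:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheAwareMapper extends Mapper<LongWritable, Text, Text, Text> {
  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    // Driver side (for reference):
    //   job.addCacheFile(new URI("hdfs:///data/cache.txt#cachefile"));
    // The "#cachefile" fragment asks the framework to symlink the localized
    // copy into the task's working directory under that name.
    URI[] cacheFiles = context.getCacheFiles(); // original URIs; null if none were added
    if (cacheFiles != null && cacheFiles.length > 0) {
      BufferedReader reader = new BufferedReader(new FileReader("cachefile"));
      try {
        String line;
        while ((line = reader.readLine()) != null) {
          // load the cached data, e.g. into an in-memory map for a map-side join
        }
      } finally {
        reader.close();
      }
    }
  }
}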
Map-side joins will use the CompositeInputFormat. They will only really be
worth doing if one data set is small and the other is large.
This is a good example :
http://www.congiu.com/joins-in-hadoop-using-compositeinputformat/
the trick is to google for CompositeInputFormat.compose() :)
Sorry I should've specified that I need an example of CompositeInputFormat that
uses the new API.
The example linked below uses old API objects like JobConf.
Any known examples of CompositeInputFormat using the new API?
Thanks in advance,
Andrew
From: Jay Vyas [mailto:jayunit...@gmail.com]
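For what it's worth, a hedged sketch of what a new-API setup might look like,
using org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat; the paths are
placeholders, both inputs must be sorted by key and identically partitioned,
and this is untested:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat;

public class MapSideJoinDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Build the join expression: an inner join of two sorted, equally
    // partitioned inputs.
    conf.set(CompositeInputFormat.JOIN_EXPR,
        CompositeInputFormat.compose("inner", KeyValueTextInputFormat.class,
            new Path("/data/left"), new Path("/data/right")));
    Job job = Job.getInstance(conf, "map-side join");
    job.setInputFormatClass(CompositeInputFormat.class);
    // The mapper then receives Text keys and TupleWritable values,
    // one tuple element per joined input.
  }
}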
Hi, I'm getting an ungoogleable exception; I've never seen this before.
This is on a Hadoop 1.1 cluster... It appears that it's permissions
related...
Any thoughts as to how this could crop up?
I assume it's a bug in my filesystem, but I'm not sure.
13/07/11 18:39:43 ERROR security.UserGroupInformation:
I figured out the issue!
The problem was in the permissions to run the Hadoop scripts from the root user. I
created a dedicated hadoop user to run the hadoop cluster, but at one point I
accidentally started hadoop from root. Hence, some of the permissions of the
hadoop scripts changed.
The solution is to again
Hi Andrew,
You could make use of the hadoop datajoin classes to perform the join, or you can
refer to these classes to get a better idea of how to perform a join.
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-tools/hadoop-datajoin
Thanks
Devaraj k
From: Botelho, Andrew [mailto:andrew.bote...@emc.com]
Hi Jay,
Here the client is trying to create a staging directory in the local file system,
which it actually should create in HDFS.
Could you check whether you have configured fs.defaultFS in the
client to point to HDFS?
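For reference, a hedged sketch of the relevant core-site.xml entry on the
client; host and port are placeholders, and on Hadoop 1.x the equivalent older
key is fs.default.name:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode-host:8020</value>
</property>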
Thanks
Devaraj k
From: Jay Vyas [mailto:jayunit...@gmail.com]
Sent: