Thanks, Qin. It sounds like you're saying that this type of
partitioning needs its own map-reduce pass.
I was hoping it could be done in the InputFormat class :))
Shirley
On Aug 4, 2008, at 2:49 PM, Qin Gao wrote:
For the first question, I think it is better to do it at the reduce stage,
because the partitioner only considers the size of blocks in bytes.
To be honest, I have permissions turned off on my DFS (I set the config
variable "dfs.permissions" to "false").
Poking around in the source code, from
hadoop/core/trunk/src/core/org/apache/hadoop/security/UnixUserGroupInformation.java
it looks like you can set the config variable "hadoop.job.ugi" to a
comma-separated user and group list to control the identity a client presents.
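A minimal sketch of that approach, assuming the 0.17-era API where
"hadoop.job.ugi" takes a "user,group1,group2,..." list (the user and group
names here are made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
// Present a different identity to the namenode before opening the
// FileSystem. Format: "user,group1,group2,..." -- values are examples.
conf.set("hadoop.job.ugi", "hadoopadmin,hadoopgroup");
FileSystem fs = FileSystem.get(conf);
fs.create(new Path("/user/hadoopadmin/test.txt")).close();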
OK, I guess I found out how: override the "configure" method of the
user-defined Map class so that you can take note of the filename.
-Kevin
On Mon, Aug 4, 2008 at 3:53 PM, Kevin <[EMAIL PROTECTED]> wrote:
> Is it possible to get this information in user defined map function?
> i.e., how do we get the JobConf object in map() function?
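A minimal sketch of the configure() approach Kevin describes, assuming the
old org.apache.hadoop.mapred API, where the framework sets "map.input.file"
to the path of the current split (the class name and output types are
illustrative):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class FileNameMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  private String inputFile;

  // configure() is called once per task with the job's JobConf,
  // so the input file name can be noted here and used in map().
  public void configure(JobConf job) {
    inputFile = job.get("map.input.file");
  }

  public void map(LongWritable offset, Text line,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    output.collect(new Text(inputFile), line);
  }
}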
My assumption -- if I run stop-all.sh _successfully_ on a Hadoop deployment
(which means every node in the grid is using the same path to Hadoop), then
that Hadoop installation becomes invisible, and any other Hadoop
deployment could start up and take its place on the grid. Let me know if
this assumption holds.
Is it possible to get this information in user defined map function?
i.e., how do we get the JobConf object in map() function?
Another way is to subclass RecordReader to embed the file name in the
data, which does not look simple.
-Kevin
On Sun, Aug 3, 2008 at 10:17 PM, Amareshwari Sriramadasu
<[EMAIL PROTECTED]> wrote:
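For completeness, a rough sketch of the RecordReader route Kevin mentions:
wrap TextInputFormat's reader so each value carries the file name. This
assumes the old mapred API; the class name and the tab-separated layout are
made up:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

public class FileNameTextInputFormat extends TextInputFormat {
  public RecordReader<LongWritable, Text> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    final String name = ((FileSplit) split).getPath().getName();
    final RecordReader<LongWritable, Text> inner =
        super.getRecordReader(split, job, reporter);
    // Delegate everything to the real reader, but prepend the file
    // name to each value so the mapper can see where it came from.
    return new RecordReader<LongWritable, Text>() {
      public boolean next(LongWritable key, Text value) throws IOException {
        if (!inner.next(key, value)) {
          return false;
        }
        value.set(name + "\t" + value.toString());
        return true;
      }
      public LongWritable createKey() { return inner.createKey(); }
      public Text createValue() { return inner.createValue(); }
      public long getPos() throws IOException { return inner.getPos(); }
      public float getProgress() throws IOException { return inner.getProgress(); }
      public void close() throws IOException { inner.close(); }
    };
  }
}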
Thank you! The java code is exactly what I want.
Following your code, I encountered a user-permission issue when trying
to write to a file. I wonder if the user id could be manipulated in
the protocol.
-Kevin
On Mon, Aug 4, 2008 at 2:27 PM, Michael Bieniosek <[EMAIL PROTECTED]> wrote:
> You can make shell calls:
You can make shell calls:
hadoop/bin/hadoop fs -fs namenode.example.com:1 -ls /
If you're in java, you can use the org.apache.hadoop.fs.FileSystem class:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration config = new Configuration();
config.set("fs.default.name", "namenode.example.com:1");
FileSystem fs = FileSystem.get(config);
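To extend Michael's snippet into something runnable, a hedged example of
writing and reading back a file through that FileSystem handle (the path is
a placeholder):

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;

// Write a file, then read it back -- the path is illustrative.
FSDataOutputStream out = fs.create(new Path("/tmp/hello.txt"));
out.writeUTF("hello, dfs");
out.close();

FSDataInputStream in = fs.open(new Path("/tmp/hello.txt"));
String contents = in.readUTF();
in.close();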
Hi there,
I am trying to use the DFS of Hadoop in other applications. It is not
clear to me how that could be done easily. Could anyone suggest a
direction or examples? Thank you.
-Kevin
I see. I think I could also modify the hadoop-env.sh in the new conf/
folders per datanode to point to the right place for HADOOP_HOME.
On Mon, Aug 4, 2008 at 3:21 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
>
>
>
> On 8/4/08 11:10 AM, "Meng Mao" <[EMAIL PROTECTED]> wrote:
> > I suppose I could, for each datanode, symlink things to point to the actual
> > Hadoop installation.
For the first question, I think it is better to do it at the reduce stage,
because the partitioner only considers the size of blocks in bytes. Instead
you can output the intermediate key/value pair like this:
key: 1 if C is one of 1,3,5,7; 0 otherwise
value: the tuple.
In the reducer you can then deal with each range separately.
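A minimal sketch of the map side of Qin's suggestion, assuming tab-separated
"A B C" input lines and the old mapred API (the field layout and class name
are made up):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class RangeKeyMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, IntWritable, Text> {

  private final IntWritable range = new IntWritable();

  public void map(LongWritable offset, Text tuple,
                  OutputCollector<IntWritable, Text> out, Reporter reporter)
      throws IOException {
    String[] fields = tuple.toString().split("\t");
    int c = Integer.parseInt(fields[2]);
    // Key is 1 when C is one of 1,3,5,7 and 0 otherwise, so all tuples
    // of one range land in the same reduce group.
    range.set((c == 1 || c == 3 || c == 5 || c == 7) ? 1 : 0);
    out.collect(range, tuple);
  }
}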
Hi,
I want to implement some data partitioning logic where a mapper is
assigned a specific range of values. Here is a concrete example of
what I have in mind:
Suppose I have attributes A, B, C and the following tuples:
(A, B, C)
(1, 3, 1)
(1, 2, 2)
(1, 2, 3)
(12, 3, 4)
(12, 2, 5)
(12, 8, 6)
On 8/4/08 11:10 AM, "Meng Mao" <[EMAIL PROTECTED]> wrote:
> I suppose I could, for each datanode, symlink things to point to the actual
> Hadoop installation. But really, I would like the setup that is hinted as
> possible by statement 1). Is there a way I could do it, or should that bit
> of documentation be corrected?
I'm trying to set up 2 Hadoop installations on my master node, one of which
will have permissions that allow more users to run Hadoop.
But I don't really need anything different on the datanodes, so I'd like to
keep those as-is. With that switch, the HADOOP_HOME on the master will be
different from the one on the datanodes.
We have seen a similar exception reported earlier by others on the list. What you
might want to try is to use a hex editor or equivalent to open up 'edits' and
get rid of the last record. In such cases the last record may be incomplete,
which is why your namenode is not starting. Once you update your edits file, try
starting the namenode again.
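A sketch of the truncation step only, under the assumption that you have
already backed up 'edits' and found the end of the last complete record by
hand (e.g. in a hex editor); the class name and arguments are illustrative:

import java.io.IOException;
import java.io.RandomAccessFile;

// Truncate the namenode 'edits' file to the end of the last complete
// record. This does no parsing: the safe byte offset must be determined
// by inspection beforehand, and the file backed up first.
public class TruncateEdits {
  public static void main(String[] args) throws IOException {
    String editsPath = args[0];      // e.g. <dfs.name.dir>/current/edits
    long lastGoodByte = Long.parseLong(args[1]); // offset found by inspection
    RandomAccessFile raf = new RandomAccessFile(editsPath, "rw");
    raf.setLength(lastGoodByte);     // drops the incomplete trailing record
    raf.close();
  }
}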
2008-08-03 21:58:33,108 INFO org.apache.hadoop.ipc.Server: Stopping
server on 9000
2008-08-03 21:58:33,109 ERROR org.apache.hadoop.dfs.NameNode:
java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at org.apache.hadoop.io.UTF8.readFields(UTF8.ja
I have the same thing:
ERROR dfs.NameNode: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at
org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:87)
at
org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:455)
at o
I'm getting the following exceptions while starting the name node -
ERROR dfs.NameNode: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at
org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:87)
at
org.apache.hadoop.dfs.FSEditL
This can be done very easily: set the number of mappers you want via
jobConf.setNumMapTasks() and use the input format
MultiFileWordCount.MyInputFormat.class, which is a concrete
implementation of MultiFileInputFormat.
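A short sketch of that job setup, assuming the example class from
org.apache.hadoop.examples and the old JobConf API (the input path and task
count are placeholders):

import org.apache.hadoop.examples.MultiFileWordCount;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

JobConf job = new JobConf();
FileInputFormat.setInputPaths(job, new Path("/user/me/input"));
// MyInputFormat is MultiFileWordCount's concrete MultiFileInputFormat.
job.setInputFormat(MultiFileWordCount.MyInputFormat.class);
job.setNumMapTasks(10);  // a hint: each map task gets a group of files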
-Original Message-
From: Jason Venner [mailto:[EMAIL PROTECTED]
Sent: Sa