In hadoop-0.17
we tried to use a 2.2GB archive and seemingly ran into
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6599383:
java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:114)
at java.util.
Hello, guys,
I am very new to Hadoop. I was trying to read Nutch data files using a
script I found on http://wiki.apache.org/nutch/Getting_Started . After two
days of trying, I still cannot get it to work. Now the error I got is
"java.lang.RuntimeException: java.io.IOException: WritableName can'
Wow, if the issue is fixed with version 0.20, then could we please have
a patch for version 0.18?
Thanks,
Deepika
-Original Message-
From: Grant Ingersoll [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 30, 2008 12:19 PM
To: core-user@hadoop.apache.org
Subject: Re: "Merge of the inmem
It has come a long way since 0.18, and Facebook keeps our (0.17) DFS mounted via
fuse and uses that for some operations.
There have recently been some problems with fuse-dfs when used in a
multithreaded environment, but those have been fixed in 0.18.2 and 0.19. (Do
not use 0.18 or 0.18.1.)
The
On Oct 31, 2008, at 3:15 PM, Bhupesh Bansal wrote:
Why do we need these setters in JobConf ??
jobConf.setMapOutputKeyClass(String.class);
jobConf.setMapOutputValueClass(LongWritable.class);
Just historical. The Mapper and Reducer interfaces didn't use to be
generic. (Hadoop used to run on
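For context, a minimal sketch (the class and method names below are made up for
illustration, using the old org.apache.hadoop.mapred API of this era) of how the
setters have to line up with the types the Mapper hands to its OutputCollector:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WordLengthMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, LongWritable> {

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, LongWritable> output, Reporter reporter)
      throws IOException {
    // emits <Text, LongWritable> intermediate pairs
    output.collect(value, new LongWritable(value.getLength()));
  }

  public static void configure(JobConf conf) {
    // The framework does not infer the intermediate types from the generic
    // signature; they still have to be declared explicitly and must match
    // the OutputCollector type arguments above.
    conf.setMapOutputKeyClass(Text.class);
    conf.setMapOutputValueClass(LongWritable.class);
  }
}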
Config on most Y! clusters sets dfs.datanode.max.xcievers to a large
value .. something like 1k to 2k. You could try that.
Raghu.
Nathan Marz wrote:
Looks like the exception on the datanode got truncated a little bit.
Here's the full exception:
2008-10-31 14:20:09,978 ERROR org.apache.hado
Hey guys,
Just curious,
Why do we need these setters in JobConf ??
jobConf.setMapOutputKeyClass(String.class);
jobConf.setMapOutputValueClass(LongWritable.class);
We should be able to extract these from the
OutputCollector of the Mapper class??
IMHO, they have to be consistent with OutputCollecto
Currently I'm just researching, so I'm playing with the idea of
streaming log data into the HDFS.
I'm confused about: "...all you need is a Hadoop install. Your production
node doesn't need to be a
datanode." If my production node is *not* a datanode, then how can I do
"hadoop dfs -put"?
I w
Looks like the exception on the datanode got truncated a little bit.
Here's the full exception:
2008-10-31 14:20:09,978 ERROR org.apache.hadoop.dfs.DataNode:
DatanodeRegistration(10.100.11.115:50010,
storageID=DS-2129547091-10.100.11.115-50010-1225485937590,
infoPort=50075, ipcPort=50020):D
Hello,
We are seeing some really bad errors on our hadoop cluster. After
reformatting the whole cluster, the first job we run immediately fails
with "Could not find block locations..." errrors. In the namenode
logs, we see a ton of errors like:
2008-10-31 14:20:44,799 INFO org.apache.hado
Hi Devraj,
It was pretty consistent with my comparator class in my old email (the
one that uses UTF8). While trying to resolve the issue, I changed UTF8
to Text. That made it disappear for a while, but then it came back again.
My new Comparator class (with Text) is -
public class IncrementalURLInde
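The class declaration above is cut off in the archive. Purely as a hedged
illustration (the class name and the plain lexicographic ordering are
hypothetical, not the poster's actual code), a comparator over Text keys for
the old mapred API might look roughly like this:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

public class UrlIndexComparator extends WritableComparator {

  public UrlIndexComparator() {
    super(Text.class);   // register the key class this comparator handles
  }

  public int compare(WritableComparable a, WritableComparable b) {
    // plain lexicographic ordering on the Text keys
    return ((Text) a).compareTo((Text) b);
  }
}

// wired into the job with:
//   jobConf.setOutputKeyComparatorClass(UrlIndexComparator.class);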
Hi,
We have deployed a new monitoring system, Chukwa
(http://wiki.apache.org/hadoop/Chukwa), that does exactly that.
This system also provides an easy way to post-process your log files and
extract useful information using M/R.
/Jerome.
On 10/31/08 1:46 PM, "Norbert Burger" <[EMAIL PROTECTED]> w
What are you using to "stream logs into the HDFS"?
If the command-line tools (i.e., "hadoop dfs -put") work for you, then all you
need is a Hadoop install. Your production node doesn't need to be a
datanode.
On Fri, Oct 31, 2008 at 2:35 PM, shahab mehmandoust <[EMAIL PROTECTED]>wrote:
> I want to
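For what it's worth, a minimal sketch of the same idea from Java rather than the
shell (the namenode URI and paths here are placeholders, not from this thread):
any machine with the Hadoop jars and network access to the namenode can push
files in, whether or not it is a datanode.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutLog {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode:9000");  // point at the NN
    FileSystem fs = FileSystem.get(conf);
    // programmatic equivalent of "hadoop dfs -put access.log /logs/access.log"
    fs.copyFromLocalFile(new Path("/var/log/access.log"),
                         new Path("/logs/access.log"));
    fs.close();
  }
}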
Hey all,
libhdfs prints out useful information to stderr in the function
errnoFromException; unfortunately, in the C application framework I
use, the stderr is directed to /dev/null, making debugging miserably
hard.
Does anyone have any suggestions to make the errnoFromException
functio
Definitely speaking Java. Do you think I'm being paranoid about the
possible load?
Shahab
On Fri, Oct 31, 2008 at 11:52 AM, Edward Capriolo <[EMAIL PROTECTED]>wrote:
> Shahab,
>
> This can be done.
> If your client speaks Java you can connect to Hadoop and write as a stream.
>
> If your client
Shahab,
This can be done.
If your client speaks Java you can connect to Hadoop and write as a stream.
If your client does not have Java, the Thrift API will generate stubs
in a variety of languages.
Thrift API: http://wiki.apache.org/hadoop/HDFS-APIs
Shameless plug -- If you just want to stream da
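A rough sketch of what "write as a stream" can look like from a client that is
not part of the cluster (the namenode URI and paths are illustrative
placeholders, not from this thread):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StreamLogsToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode:9000");
    FileSystem fs = FileSystem.get(conf);

    FSDataOutputStream out = fs.create(new Path("/logs/app-2008-10-31.log"));
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
    String line;
    while ((line = in.readLine()) != null) {   // e.g. piped from tail -f
      out.write((line + "\n").getBytes("UTF-8"));
    }
    out.close();   // the file only becomes fully visible to readers once closed
    fs.close();
  }
}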
I want to stream data from logs into the HDFS in production but I do NOT
want my production machine to be part of the computation cluster. The
reason I want to do it in this way is to take advantage of HDFS without
putting computation load on my production machine. Is this possible*?*
Furthermor
True, dfs.http.address is the NN Web UI address.
This is where the NN http server runs. Besides the Web UI there is also
a servlet running on that server which is used to transfer the image
and edits from the NN to the secondary using http get.
So the SNN uses both addresses, fs.default.name and dfs.http.address.
W
I will also be presenting on Mahout (machine learning) on Wednesday at
3:30 (I think). It will have some Hadoop flavor in it.
-Grant
On Oct 31, 2008, at 1:46 PM, Owen O'Malley wrote:
Just a reminder that ApacheCon US is next week in New Orleans. There
will be a lot of Hadoop developers and
Hi,
Hope somebody will record at least a fraction of these talks and put them on
the web as soon as possible.
Lukas
On Fri, Oct 31, 2008 at 6:46 PM, Owen O'Malley <[EMAIL PROTECTED]> wrote:
> Just a reminder that ApacheCon US is next week in New Orleans. There will
> be a lot of Hadoop developers an
Owen,
Just wanted to mention that there is a talk on Hive as well on Friday 9:30AM...
Ashish
-Original Message-
From: Owen O'Malley [mailto:[EMAIL PROTECTED]
Sent: Friday, October 31, 2008 10:47 AM
To: [EMAIL PROTECTED]
Cc: core-user@hadoop.apache.org
Subject: ApacheCon US 2008
Just a r
Just a reminder that ApacheCon US is next week in New Orleans. There
will be a lot of Hadoop developers and talks. (I'm CC'ing core-user
because it has the widest coverage. Please join the low traffic
[EMAIL PROTECTED] list for cross sub-project announcements.)
* Hadoop Camp with lots o
Otis Gospodnetic wrote:
Konstantin & Co, please correct me if I'm wrong, but looking at
hadoop-default.xml makes me think that dfs.http.address is only the URL for the NN
*Web UI*. In other words, this is where people go to look at the NN.
The secondary NN must then be using only the Primary
Hi,
could anyone tell me what the current status of FUSE support for HDFS
is? Is this something that can be expected to be usable in a few
weeks/months in a production environment? We have been really
happy/successful with HDFS in our production system. However, some
software we use in our applic
You can override this property by passing in -jobconf
mapred.output.compress=false to the hadoop binary, e.g.
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.18.0-streaming.jar
-input "/user/root/input" -mapper 'cat' -reducer 'wc -l' -output
"/user/root/output" -jobconf mapred.job.name="Exp
The Hadoop file API allows you to open a file based on a URL:
Path file = new
Path("hdfs://hadoop00:54313/user/hadoop/conflated.20081016/part-9");
JobConf job = new JobConf(new Configuration(),
ReadFileHadoop.class);
job.setJobName("test");
FileSystem fs = file.getFileSystem(job);   // resolve the filesystem from the hdfs:// scheme
FSDataInputStream in = fs.open(file);      // read the file contents as a stream
Alex Loddengaard wrote:
Thanks, Steve. I'll look into this patch. As a temporary solution I use a
log4j variable to manually set a "hostname" private field in the Appender.
This solution is rather annoying, but it'll work for now.
Thanks again.
what about having the task tracker pass down a
Hi All,
I'm wondering that can I browse a HDFS file in URL (ex.
http://host/test.jpeg) so that I can show this file on my webapp
directly.
Thanks,
Neal
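One hedged way to do this, using the same FileSystem API mentioned elsewhere in
the thread (the servlet name, namenode URI and request handling here are all
illustrative assumptions): a small servlet that streams an HDFS file out over
HTTP so the webapp can link to it.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsImageServlet extends HttpServlet {
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode:9000");
    FileSystem fs = FileSystem.get(conf);

    Path p = new Path(req.getParameter("path"));   // e.g. /view?path=/images/test.jpeg
    resp.setContentType("image/jpeg");
    InputStream in = fs.open(p);
    OutputStream out = resp.getOutputStream();
    byte[] buf = new byte[64 * 1024];
    int n;
    while ((n = in.read(buf)) != -1) {
      out.write(buf, 0, n);   // copy HDFS bytes straight into the HTTP response
    }
    in.close();
    fs.close();
  }
}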
Thanks, Steve. I'll look into this patch. As a temporary solution I use a
log4j variable to manually set a "hostname" private field in the Appender.
This solution is rather annoying, but it'll work for now.
Thanks again.
Alex
On Fri, Oct 31, 2008 at 3:58 AM, Steve Loughran <[EMAIL PROTECTED]>
Alex Loddengaard wrote:
I'd like my log messages to display the hostname of the node that they were
outputted on. Sure, this information can be grabbed from the log filename,
but I would like each log message to also have the hostname. I don't think
log4j provides support to include the hostnam
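Not the patch being discussed, just a hedged sketch of one common workaround:
put the hostname into log4j's MDC once at startup (MDC is per-thread, so set it
before worker threads are spawned) and reference it from the ConversionPattern,
e.g. "%d %X{hostname} %-5p %c - %m%n" in log4j.properties:

import java.net.InetAddress;
import org.apache.log4j.Logger;
import org.apache.log4j.MDC;

public class HostnameLogging {
  private static final Logger LOG = Logger.getLogger(HostnameLogging.class);

  public static void main(String[] args) throws Exception {
    MDC.put("hostname", InetAddress.getLocalHost().getHostName());
    LOG.info("this message now carries a hostname field via %X{hostname}");
  }
}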
To complete the picture: not only was our network swamped, I realized
tonight that the NameNode/JobTracker was running on a 99% full disk (it hit
100% full about thirty minutes ago). That poor JobTracker was fighting
against a lot of odds. As soon as we upgrade to a bigger disk and switch it
back o