Hi,
HDFS splits files into blocks, and MapReduce runs a map task for each
block. However, the fields can change within IIS log files, which means
fields in one block may depend on another block, making the files
unsuitable for a MapReduce job. It seems there should be some
preprocessing before storing, and
You can run a MapReduce job first to join these data sets into one data set,
then analyze the joined data set.
On Mon, Dec 30, 2013 at 3:58 PM, Fengyun RAO raofeng...@gmail.com wrote:
Hi,
HDFS splits files into blocks, and mapreduce runs a map task for each
block. However, Fields could be
What do you mean by joining the data sets?
A fake sample log file:
#Software: Microsoft Internet Information Services 7.5
#Version: 1.0
#Date: 2013-07-04 20:00:00
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port
cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status
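For context, the #Fields: directive names the columns for the data lines that follow it, so any parser has to read it before it can interpret a row. A minimal sketch of turning that directive into a column index (plain Java, outside MapReduce; the class and method names here are just illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldsParser {
    // Parse an IIS W3C "#Fields:" directive into a column-name -> index map,
    // so later data lines can be addressed by field name after splitting.
    public static Map<String, Integer> parseFields(String directive) {
        Map<String, Integer> index = new LinkedHashMap<>();
        // Strip the "#Fields:" prefix, then split the remainder on whitespace.
        String[] names = directive.substring("#Fields:".length()).trim().split("\\s+");
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> idx = parseFields(
            "#Fields: date time s-ip cs-method cs-uri-stem sc-status");
        System.out.println(idx.get("cs-method")); // 3
    }
}
```

This is exactly why splitting mid-file is a problem: a block that starts after the directive has no way to rebuild this map.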
I am trying to write an object into HDFS.
public static Split currentsplit = new Split();
Split currentsplit = new Split();
Path p = new Path("C45/mysavedobject");
ObjectOutputStream oos = new ObjectOutputStream(fs.create(p));
oos.writeObject(currentsplit);
oos.close();
But I am not
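For what it's worth, the same write path can be exercised locally with in-memory streams, which separates serialization problems from HDFS problems. This sketch uses a stand-in Split class (an assumption — the real one must implement Serializable, and so must everything it references, or writeObject throws NotSerializableException):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-in for the poster's Split class, just for illustration.
class Split implements Serializable {
    int id = 42;
}

public class SerializeDemo {
    // Round-trip an object through Java serialization in memory; the same
    // stream logic works against fs.create(path) on HDFS.
    public static Split roundTrip(Split in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
            oos.writeObject(in);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            return (Split) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Split copy = roundTrip(new Split());
        System.out.println(copy.id); // 42
    }
}
```

If this local round trip fails, the HDFS write will fail the same way.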
Hi,
I am using multiple outputs in our job, so whenever any reduce task fails,
all its subsequent task attempts fail with a "file exists" exception.
The output file name should also append the task attempt, right? But it is
only appending the task id. Is this a bug, or is something wrong on my
Hi,
While executing the word count MapReduce program with a 95.2 MB input
file, an error occurs: "Error: Java heap space".
I have added -D mapred.child.java.opts=Xmx4096M at runtime as well,
but the error is not solved.
In the code I have also written conf.mapred.map.java.task=Xmx512M
for
Not sure, but I think you need to write =-Xmx; you forgot the dash.
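In other words, the JVM option is -Xmx4096m, not Xmx4096m. A corrected invocation would look something like this (the jar and class names are placeholders):

```shell
# Note the leading dash in -Xmx: the value passed to the child JVM
# must be a valid JVM flag.
hadoop jar wordcount.jar WordCount \
  -D mapred.child.java.opts=-Xmx4096m \
  /input /output
```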
2013/12/30 Ranjini Rathinam ranjinibe...@gmail.com
Hi,
While excuting the word count mapreduce program , input file is 95.2 MB.
the error occures like this Error: Java Heap space
I have added -D
Convert the JSON string to a Java object, and the Java object back to a
JSON string, then:
conf.set(yourkey,jsonStr);
--
-
Best Wishes
Blog:http://snv.iteye.com/
Email:1134687...@qq.com
-- Original --
From: unmesha
Not unique to HDFS. The same thing would happen on your local file system,
or anywhere and any way you store the state of the object outside of the
JVM. That is why singletons should not be serializable.
Chris
On Dec 30, 2013 5:46 AM, unmesha sreeveni unmeshab...@gmail.com wrote:
I am trying to
Are you using the MultipleOutputs class shipped with Apache Hadoop or
one of your own?
If it's the latter, please take a look at the gotchas to take care of
described at
http://wiki.apache.org/hadoop/FAQ#Can_I_write_create.2Fwrite-to_hdfs_files_directly_from_map.2Freduce_tasks.3F
On Mon, Dec 30,
Hi
I am using the instruction set below to set up a Hadoop cluster.
https://www.dropbox.com/s/05aurcp42asuktp/Chiu%20Hadoop%20Pig%20Install%20Instructions.docx
and to download hadoop using
https://www.dropbox.com/s/znonl6ia1259by3/hadoop-1.1.2.tar.gz
I don't know any examples of IIS log files, but from what you described, it
looks like analyzing one line of log data depends on some previous lines'
data. You should be clearer about what this dependence is and what you are
trying to do.
Just based on your questions, you still have different
What's wrong with downloading it from the official Apache website?
http://archive.apache.org/dist/hadoop/core/hadoop-1.1.2/
Yong
Date: Mon, 30 Dec 2013 11:42:25 -0500
Subject: Unable to access the link
From: navaz@gmail.com
To: user@hadoop.apache.org
Hi
I am using below instruction set to set up
Thanks. But I am following the steps mentioned in the above file. Also, I
am interested in using the same wordcount program and gettysburg.txt that
are used in the instruction set.
On Mon, Dec 30, 2013 at 11:49 AM, java8964 java8...@hotmail.com wrote:
What's wrong to download it from the Apache
Thanks Harsh.
@Are you using the MultipleOutputs class shipped with Apache Hadoop or
one of your own?
I am using Apache Hadoop's MultipleOutputs.
But as you can see in the stack trace, it is not appending the attempt id
to the file name; it only consists of the task id.
Thanks & Regards,
B Anil Kumar.
On
The (509) error is telling you what the problem is:
This account's public links are generating too much traffic and have been
temporarily disabled!
This seems to mean, since it is Dropbox, that there has been too much traffic
directed towards the file or to other public links owned by the hoster's
Hi,
I am trying to puzzle this out, and am hoping for some insight - I have an
IMAP inbox dump that I am analyzing - I need to track how many times a
given item is referred to in the inbox, i.e. how many emails came in about
that thing and over what time. I can load it into MapReduce as
You can find the subscription mail IDs on this page:
http://hadoop.apache.org/mailing_lists.html
On Mon, Dec 30, 2013 at 12:10 AM, sunqp qipeng@gmail.com wrote:
I think if the task fails, the output related to that task will be cleaned
up before the second attempt. I am guessing you get this exception because
two reducers tried to write to the same file. One thing you need to be
aware of is that all data that is supposed to be in the same file
Thank you. The link is up now. I have saved the file that I required.
On Mon, Dec 30, 2013 at 12:43 PM, Devin Suiter RDX dsui...@rdx.com wrote:
The (509) error is telling you what the problem is:
This account's public links are generating too much traffic and have
been temporarily
Thanks, Yong!
The dependence never crosses files, but since HDFS splits files into
blocks, it may cross blocks, which makes it difficult to write an MR job. I
don't quite understand what you mean by WholeFileInputFormat. Actually, I
have no idea how to deal with dependence across blocks.
2013/12/31
Google "Hadoop WholeFileInputFormat" or search for it in the book Hadoop:
The Definitive Guide.
Yong
Date: Tue, 31 Dec 2013 09:39:58 +0800
Subject: Re: any suggestions on IIS log storage and analysis?
From: raofeng...@gmail.com
To: user@hadoop.apache.org
Thanks, Yong!
The dependence never cross files,
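The idea behind WholeFileInputFormat is to make each file a single, unsplittable record, so nothing can cross a block boundary. Stripped of the Hadoop API, its record reader boils down to handing the mapper the entire file contents as one value; a plain-Java sketch of that core (class and method names are illustrative):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class WholeFileDemo {
    // The essence of a whole-file record reader: return the complete file
    // as a single record, so a "#Fields:" header and the lines that depend
    // on it always arrive in the same map call.
    public static String readWholeFile(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("iislog", ".log");
        Files.write(tmp, "#Fields: date time\n2013-07-04 20:00:00\n"
                .getBytes(StandardCharsets.UTF_8));
        String record = readWholeFile(tmp);
        System.out.println(record.startsWith("#Fields:")); // true
        Files.delete(tmp);
    }
}
```

In the real input format you would also mark the files as non-splittable (isSplitable returning false), which trades parallelism per file for correctness of the per-file state.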
Hi,
I would like to know if MRv2 provides the following through the bash
command line:
- get the number of running jobs
- get the percentage of completion of jobs
- get the number of jobs waiting to be submitted
--
Thanks,
The UI, or a hadoop job command like: hadoop job -list
--
-
Best Wishes
Blog:http://snv.iteye.com/
Email:1134687...@qq.com
-- Original --
From: xeon psdc1...@gmail.com
Date: Tue, Dec 31, 2013
Generally, MRv2 means YARN. You can try:
yarn application
and the full help listing is shown.
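For example, these subcommands cover most of what was asked (the application id below is a placeholder):

```shell
# List applications currently running on YARN:
yarn application -list
# Show the status of one application, including its progress percentage:
yarn application -status <application_id>
```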
On Tue, Dec 31, 2013 at 12:32 PM, 小网客 smallnetvisi...@foxmail.com wrote:
ui or hadoop job command like:hadoop job -list
--
-
I have used the dash (-), but the error still occurs. I am not able to fix
it. Please help me fix it.
Regards,
Ranjini
On Mon, Dec 30, 2013 at 5:24 PM, Dieter De Witte drdwi...@gmail.com wrote:
not sure but i think you need to write =-Xmx, you forgot the dash..
2013/12/30 Ranjini Rathinam
Hi,
I want to compare a value from one HBase table to a value in another HBase
table, and need to add one column as a valid indicator:
if the value matches, mark the field as 0; if it does not match, mark it as 1.
I have used the Filter command in my MapReduce code,
but the column is not printed in the HBase table.
Hey Devin,
Are you perhaps looking for http://james.apache.org/mime4j/? You may have
to adapt it for MR but I don't imagine that would be too difficult to do.
On Mon, Dec 30, 2013 at 11:59 PM, Devin Suiter RDX dsui...@rdx.com wrote:
Hi,
I am trying to puzzle this out, and am hoping for some
-user@hadoop (bcc)
Please ask HBase questions on its own lists (u...@hbase.apache.org,
you may have to subscribe)
You're constructing a Put object. Do you then call table.put(obj) to
actually send it to the table?
On Tue, Dec 31, 2013 at 11:31 AM, Ranjini Rathinam
ranjinibe...@gmail.com wrote: