Thank you all. In fact, I don't expect this approach to enhance performance.
I need to process 3 different logs (each with a different format). I just want
to start processing all 3 logs at the same time, all in this one program,
but with a different separator for each.
I suppose you could also leverage job configuration, or a per-input
mapper implementation via MultipleInputs, to do this.
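For instance, a minimal (untested) sketch against the old mapred API; the input paths and the three mapper classes are placeholders you would write per log format:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

JobConf conf = new JobConf(ThreeLogsDriver.class); // placeholder driver class
conf.setJobName("three-logs");
// One mapper class per log format; each mapper can pull its own
// separator out of the job configuration and parse accordingly.
MultipleInputs.addInputPath(conf, new Path("/logs/format-a"),
    TextInputFormat.class, FormatAMapper.class);
MultipleInputs.addInputPath(conf, new Path("/logs/format-b"),
    TextInputFormat.class, FormatBMapper.class);
MultipleInputs.addInputPath(conf, new Path("/logs/format-c"),
    TextInputFormat.class, FormatCMapper.class);
JobClient.runJob(conf);

All three inputs run inside the same job, so the three logs get processed at the same time as you wanted.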
On Thu, Dec 13, 2012 at 5:44 PM, Yu Yang clouder...@gmail.com wrote:
Hi,
There are only 2 types of map output files: Sequence and Text files. If
those files are going to be used as input to several reduce tasks,
they need to be partitioned into blocks. Are there any separator bits
that delimit each partition? Can I read a specific partition of a map
output file? Is
Map output files, by which you perhaps mean the intermediate data files
for temporary K/V persistence, are stored as IFiles. They use neither
text nor sequence files (historically, though, they did use sequence
files at some point).
You can read the IFile's sources at
Hello Pedro,
The first part of your question is very well covered by Harsh.
For the second part, the number of partitions and the assignment of records
to them are governed by the getPartition() method of the Partitioner
interface. The default behavior is to partition based on a hash of the key
(HashPartitioner). You can have
I agree with Harsh.
Regards,
Mohammad Tariq
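For illustration, a custom partitioner against the old mapred API might look like this (untested sketch; the class and its routing rule are made up):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

// Routes each record by the first character of its key instead of
// the default hash of the whole key (HashPartitioner).
public class FirstCharPartitioner implements Partitioner<Text, IntWritable> {
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    // Must return a value in the range [0, numPartitions).
    return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
  }
  public void configure(JobConf job) {
    // No extra configuration needed for this example.
  }
}

You would then register it on the job with conf.setPartitionerClass(FirstCharPartitioner.class).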
On Thu, Dec 13, 2012 at 12:26 PM, Harsh J ha...@cloudera.com wrote:
If your production target is a bit far away, I'd encourage setting up
and using the 2.x based releases for its feature set that may aid you
in your design. We'll be releasing
Harsh,
can you please tell us whether the 2.0.3 release will be ready by the end of Jan 2013?
Regards,
Ivan
2012/12/13 Harsh J ha...@cloudera.com
I do feel so. This is the ongoing discussion with further details:
http://search-hadoop.com/m/4U27S1Zf9eF1
On Thu, Dec 13, 2012 at 2:53 PM, Ivan Ryndin iryn...@gmail.com wrote:
Dear All,
I was looking at options for reducing the overall cost of storage that is
incurred due to replication of data across the datanodes for higher
availability and data localization for processing.
I stumbled on a few articles suggesting erasure coding (software-raid) as one
such
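For context, a back-of-envelope comparison (my own illustrative numbers, not from this thread): 3-way replication stores 3 copies of every block, i.e. 200% storage overhead while tolerating the loss of 2 replicas, whereas a Reed-Solomon code such as RS(10,4) stores 10 data blocks plus 4 parity blocks, i.e. 40% overhead while tolerating the loss of any 4 blocks, at the cost of reconstruction reads and reduced data locality.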
Hello Guys,
Now that I have downloaded the hadoop 1.1.1 source tar ball, I am trying to
compile it for my platform (s390) running SLES 11.
I am encountering a couple of problems for which I have some questions:
1) Is there an official guide from the hadoop project showing how to build a
binary
After all my R&D, I have set up hadoop 0.22.0 successfully. Right now, I am
using Eclipse Indigo Service Release 2 and hadoop 0.22.0 on Win 7. Trying
to use the eclipse plugin provided in the Hadoop package, but that does not
seem to work. When I try adding a new Hadoop Location, I get an error
branch-1 does not use maven but ant.
There are some docs here:
http://wiki.apache.org/hadoop/BuildingHadoopFromSVN, not sure it's totally
up to date.
On Thu, Dec 13, 2012 at 11:08 AM, Emile Kao emile...@gmx.net wrote:
3) Can I compile the package in a simpler way other than maven?
FYI, I compiled 1.0.3 successfully using ant last week, so the steps still
seem to be good.
JM
On Dec 13, 2012, at 05:28, Nicolas Liochon nkey...@gmail.com wrote:
Great news, thanks a lot.
So, yes, our production date would be around May or June 2013; do you
think we would have a production-ready, stable 2.x version by then?
Thanks a lot,
Hernan
2012/12/13 Harsh J ha...@cloudera.com
Hi,
Take a look here: I think you should be using ant instead...
http://wiki.apache.org/hadoop/QwertyManiac/BuildingHadoopTrunk#Building_branch-1
JM
Hi,
I am relatively new to Hadoop and completely new to SSL encryption. I am
having issues getting encrypted shuffle working on a small test cluster
with MapReduce v1. I am using self-signed certificates I generated with the
Java keytool. I followed the instructions on the site Apache Hadoop
This is a dated blog post, so it would help if someone with current HDFS
knowledge could validate it:
http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/
.
There is a bit about the RAM required for the Namenode and how to compute
it:
You can look at the 'Namespace
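As a rough, illustrative sanity check (my own numbers, not from the post): a commonly cited rule of thumb is on the order of 150 bytes of NameNode heap per namespace object (file, directory, or block). Under that assumption, 10 million files at one block each is about 20 million objects, and 20,000,000 x 150 bytes is roughly 3 GB of heap before directories and other overhead. Verify against current HDFS documentation before sizing anything.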
Hi all,
I am relatively new to Hadoop and want to run pre-commit on branch-1 before
checking a patch into the community repo; however, there is no pre-commit
job in the community Jenkins.
Does anyone have any good suggestions, or can the community Jenkins help?
Thanks in advance!!
Regards,
Wenwu, Peng
I do think running a build would cause that; doing an 'ant clean'
would resolve it for you.
But I agree that the default version could perhaps be the release itself
(although then it gets harder to identify that the user ran a build?).
Please file a JIRA for further discussion.
On Fri, Dec 14,
I'm submitting unrelated jobs programmatically (using AWS EMR) so they run
in parallel.
I'd like to run an s3distcp job in parallel as well, but the interface to
that job is a Tool, e.g. ToolRunner.run(...).
ToolRunner blocks until the job completes though, so presumably I'd need to
create a
I did bin/hadoop tasktracker and that started it :)
Thanks, Andy
On Thu, Dec 13, 2012 at 10:40 PM, Kartashov, Andy andy.kartas...@mpac.ca wrote:
# service --status-all
David,
You can try like below: instead of runJob(), use submitJob().
JobClient jc = new JobClient(job);
jc.submitJob(job);
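For instance (untested sketch against the old mapred API; MyJob is a placeholder): submitJob() returns a RunningJob handle right away instead of blocking like runJob(), so you can fire off several jobs and then poll them:

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

// Submit without blocking; runJob() would wait for completion.
JobConf conf = new JobConf(MyJob.class);
JobClient client = new JobClient(conf);
RunningJob running = client.submitJob(conf);
// ... submit the other jobs here, then poll:
while (!running.isComplete()) {
  Thread.sleep(5000);
}
System.out.println("Succeeded: " + running.isSuccessful());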
Cheers!
Manoj.
On Fri, Dec 14, 2012 at 10:09 AM, David Parks davidpark...@yahoo.com wrote:
Can you show some sample code for submitting a distcp job?
Cheers!
Manoj.
On Fri, Dec 14, 2012 at 11:44 AM, David Parks davidpark...@yahoo.com wrote:
Can I do that with s3distcp / distcp? The job is being configured in the
run() method of s3distcp (as it implements Tool). So I think I can’t
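One illustrative way around the blocking (my own untested sketch; nothing s3distcp-specific here) is to run each ToolRunner.run() call on its own thread, so the main program can keep going:

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class AsyncToolLauncher {
  private final ExecutorService pool = Executors.newFixedThreadPool(2);

  // Wraps the blocking ToolRunner.run() call in a background task and
  // returns a Future holding the tool's exit code.
  public Future<Integer> launch(final Tool tool, final String[] args) {
    return pool.submit(new Callable<Integer>() {
      public Integer call() throws Exception {
        return ToolRunner.run(new Configuration(), tool, args);
      }
    });
  }
}

You would hand launch() the Tool instance and its arguments; Future.get() blocks only when you ask for the exit code.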