hprof profiler output location

2013-06-16 Thread YouPeng Yang
Hi All, I want to profile a fraction of the tasks in a job, so I configured my job as [1]. However, I could not get the hprof profiler output on the host from which I submitted the job. (I use MRv2 with YARN, CDH4.1.2.) Where can I find the hprof profiler output? [1] job.setProfileEnabl
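
For illustration, a minimal sketch of the kind of configuration being described (assuming the MRv2 Job API; the task ranges and hprof parameters are placeholders, not taken from the original mail):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Job job = Job.getInstance(new Configuration(), "profiled-job");
    job.setProfileEnabled(true);             // turn hprof on for selected tasks
    job.setProfileTaskRange(true, "0-2");    // profile only map tasks 0..2
    job.setProfileTaskRange(false, "0-2");   // profile only reduce tasks 0..2
    job.setProfileParams(
        "-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");

The %s placeholder is substituted with a per-task output path, so the profile output is written next to the task logs on the worker nodes; whether it is also copied back to the submitting host depends on the client and version.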

MRunit DOWNLOAD URLs are unavailable

2013-06-16 Thread YouPeng Yang
Hi All, I want to report that the MRunit download URLs are unavailable. http://www.apache.org/dyn/closer.cgi/incubator/mrunit/ Could anyone give me another available URL? Regards. Thank you.

Re: MRunit DOWNLOAD URLs are unavailable

2013-06-16 Thread Jagat Singh
http://mrunit.apache.org/general/downloads.html On Jun 16, 2013 8:20 PM, "YouPeng Yang" wrote: > Hi All > > I want to report that the MRunit download URLs are unavailable. > http://www.apache.org/dyn/closer.cgi/incubator/mrunit/ > > Could anyone give me another available URL? > > Regards > >

HDFS file reader and buffering

2013-06-16 Thread John Lilley
Do the HDFS file-reader classes perform internal buffering? Thanks John

Re: HDFS file reader and buffering

2013-06-16 Thread Harsh J
Yes, they maintain a buffer equal to the configurable io.file.buffer.size (4k by default) for both reads and writes. On Sun, Jun 16, 2013 at 7:03 PM, John Lilley wrote: > Do the HDFS file-reader classes perform internal buffering? > > Thanks > > John > > > > -- Harsh J
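
A minimal sketch of where that setting applies (assumption: the standard Configuration/FileSystem API; the 128k value and the path are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    conf.setInt("io.file.buffer.size", 131072);   // raise the 4k default to 128k
    FileSystem fs = FileSystem.get(conf);
    FSDataInputStream in = fs.open(new Path("/data/input.txt"));  // reads go through the buffer

FileSystem#open(Path, int) also accepts an explicit per-stream buffer size that overrides the configured default.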

Re: how to get the mapreduce code which was pig/hive script translated to?

2013-06-16 Thread Harsh J
This is a question for the Hive/Pig lists to answer best. Note though that they only compile a plan, not the code. The code is available already; the compiled plan just structures the execution flow. If you take a look at the sources, you'll find the bits and pieces that get linked together depend

Assigning the same partition number to the mapper output

2013-06-16 Thread Maysam Hossein Yabandeh
Hi, I was wondering if it is possible in Hadoop to assign the same partition numbers to the map outputs. I am running a map-only job (with zero reducers) and Hadoop shuffles the partition numbers in the output: i.e., input/part-m-X is processed by task number Y and hence generates output/part-m-000

Re: About hadoop-2.0.5 release

2013-06-16 Thread Roman Shaposhnik
On Tue, Jun 11, 2013 at 11:22 PM, Ramya S wrote: > Hi, > > When will the stable version of hadoop-2.0.5-alpha be released? hadoop-2.0.5-alpha was released last week and can be obtained either in its source form: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.0.5-alpha/ or

webhdfs kerberos checksum failed

2013-06-16 Thread Lanati, Matteo
Hi all, I'm trying to set up webhdfs on Hadoop 1.20 with security. I added the following to hdfs-site.xml: dfs.webhdfs.enabled = true; dfs.web.authentication.kerberos.principal = HTTP/master.hadoop.lo...@hadoop.lrz.de; dfs.web.authentication.kerberos.keytab = /home/

Re: how to get the mapreduce code which was pig/hive script translated to?

2013-06-16 Thread Edward Capriolo
Hive serializes the entire plan into an XML file. If you set the log4j settings to debug, you should get the locations of the files it generates before launching the job. On Sun, Jun 16, 2013 at 11:08 AM, Harsh J wrote: > This is a question for the Hive/Pig lists to answer best. > > Note though t

Re: how to get the mapreduce code which was pig/hive script translated to?

2013-06-16 Thread Marcos Luis Ortiz Valmaseda
Edward is right. With log4j, you can see that. Here is an example: https://github.com/apache/hadoop-common/blob/HADOOP-3628/conf/log4j.properties The relevant info is in the docs: http://hadoop.apache.org/docs/stable/cluster_setup.html#Logging Some working examples: http://stackoverflow.com/
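
As a concrete illustration (an assumption, not from the thread: Hive's log4j 1.x logging and the org.apache.hadoop.hive logger prefix), the level can also be raised programmatically:

    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;

    // Raise Hive's loggers to DEBUG so the locations of the generated
    // plan files are printed before the job launches.
    Logger.getLogger("org.apache.hadoop.hive").setLevel(Level.DEBUG);

The equivalent hive-log4j.properties entry would be along the lines of log4j.logger.org.apache.hadoop.hive=DEBUG.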

RE: how to design the mapper and reducer for the below problem

2013-06-16 Thread John Lilley
I don't think this can be done in a single map/reduce pass. Here the author discusses an implementation in Pig: http://techblug.wordpress.com/2011/08/07/transitive-closure-in-pig/ john From: parnab kumar [mailto:parnab.2...@gmail.com] Sent: Thursday, June 13, 2013 10:42 PM To: user@hadoop.apache.org S

RE: how to design the mapper and reducer for the below problem

2013-06-16 Thread John Lilley
Sorry, this is the link I meant: http://hortonworks.com/blog/transitive-closure-in-apache-pig/ john From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Sunday, June 16, 2013 1:02 PM To: user@hadoop.apache.org Subject: RE: how to design the mapper and reducer for the below problem I don't th

RE: How to design the mapper and reducer for the following problem

2013-06-16 Thread John Lilley
You basically have a "record similarity scoring and linking" problem -- common in data-quality software like ours. This could be thought of as computing the cross-product of all records, counting the number of hash keys in common, and then outputting those that exceed a threshold. This is very
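
A sketch of the pairing step described above (not from the thread; it assumes an upstream mapper has already emitted (hashKey, recordId) pairs, and all class and variable names are illustrative):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class CandidatePairReducer extends Reducer<Text, Text, Text, IntWritable> {
      private static final IntWritable ONE = new IntWritable(1);

      @Override
      protected void reduce(Text hashKey, Iterable<Text> recordIds, Context context)
          throws IOException, InterruptedException {
        // Collect the ids of all records sharing this hash key.
        List<String> ids = new ArrayList<String>();
        for (Text id : recordIds) {
          ids.add(id.toString());
        }
        // Emit each unordered pair once, with a count of 1.
        for (int i = 0; i < ids.size(); i++) {
          for (int j = i + 1; j < ids.size(); j++) {
            String a = ids.get(i), b = ids.get(j);
            Text pair = new Text(a.compareTo(b) < 0 ? a + "\t" + b : b + "\t" + a);
            context.write(pair, ONE);
          }
        }
      }
    }

A second job would then sum the counts per pair and keep only the pairs whose count of shared hash keys exceeds the threshold. The loop is quadratic in group size, so heavily skewed hash keys need separate handling.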

RE: How to design the mapper and reducer for the following problem

2013-06-16 Thread John Lilley
On further thought, it would be simpler to augment Reducer1 to use disk when it does not fit into memory. Nested looping over the disk file is sequential and will be fast. Then you can avoid the distributed join. john From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Sunday, June 16, 2
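
A sketch of that spill-to-disk idea (not from the thread; names are illustrative, and for brevity it always spills, where a real implementation would only do so once an in-memory list outgrows some threshold):

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SpillingPairReducer extends Reducer<Text, Text, Text, Text> {
      @Override
      protected void reduce(Text key, Iterable<Text> values, Context context)
          throws IOException, InterruptedException {
        // The values iterator is single-pass, so copy the group to a local file.
        File spill = File.createTempFile("group-", ".spill");
        try (BufferedWriter w = new BufferedWriter(new FileWriter(spill))) {
          for (Text v : values) {
            w.write(v.toString());
            w.newLine();
          }
        }
        // Nested loop over the spill file: both passes read sequentially.
        try (BufferedReader outer = new BufferedReader(new FileReader(spill))) {
          String a;
          long i = 0;
          while ((a = outer.readLine()) != null) {
            long j = 0;
            try (BufferedReader inner = new BufferedReader(new FileReader(spill))) {
              String b;
              while ((b = inner.readLine()) != null) {
                if (j++ > i) {              // each unordered pair only once
                  context.write(new Text(a), new Text(b));
                }
              }
            }
            i++;
          }
        }
        spill.delete();
      }
    }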

Re: how to get the mapreduce code which was pig/hive script translated to?

2013-06-16 Thread Lance Norskog
Both Pig and Hive have an 'explain plan' command that prints a schematic version of the plan. This might make it easier to see which M/R algorithms are used. Mostly the data goes through single-threaded transforms inside a mapper or reducer. https://cwiki.apache.org/Hive/languagemanual-explain.html On 06/

RE: Assigning the same partition number to the mapper output

2013-06-16 Thread Devaraj k
If you are using TextOutputFormat for your job, the getRecordWriter() method (i.e., RecordWriter org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TaskAttemptContext job) throws IOException, InterruptedException) uses FileOutputFormat.getDefaultWorkFile() for generating the fi
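
The message is truncated here, but building on that description: since getDefaultWorkFile() derives the part number from the task id, a common workaround (a sketch, not from this thread; the class name is illustrative) is to bypass it with MultipleOutputs and name each output after its input split:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class PartitionPreservingMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {
      private MultipleOutputs<NullWritable, Text> out;
      private String baseName;

      @Override
      protected void setup(Context context) {
        out = new MultipleOutputs<NullWritable, Text>(context);
        // Recover the partition number of the *input* file,
        // e.g. "part-m-00042" -> "part-00042".
        String inputName = ((FileSplit) context.getInputSplit()).getPath().getName();
        baseName = "part-" + inputName.replaceAll("\\D", "");
      }

      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // MultipleOutputs still appends a task suffix, so the file comes out
        // as e.g. part-00042-m-00007; the input partition number survives in
        // the leading portion of the name.
        out.write(NullWritable.get(), value, baseName);
      }

      @Override
      protected void cleanup(Context context) throws IOException, InterruptedException {
        out.close();
      }
    }

In the driver, LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class) avoids the empty default part files that would otherwise still be created.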