Re: json writablecomparable

2010-06-30 Thread Ted Yu
Interesting: http://www.umiacs.umd.edu/~jimmylin/Cloud9/docs/content/data-types.html You need to define your own custom comparator. On Tue, Jun 29, 2010 at 10:41 PM, Oded Rotem wrote: > Hi, > > > > Is there a json writablecomparable implementation anywhere? > > > > Thanks, > > Oded > > > > > >

Re: json writablecomparable

2010-06-30 Thread Ted Yu
> implementation... > > -Original Message- > From: Ted Yu [mailto:yuzhih...@gmail.com] > Sent: Wednesday, June 30, 2010 7:03 PM > To: common-user@hadoop.apache.org > Subject: Re: json writablecomparable > > Interesting: > http://www.umiacs.umd.edu/~jimmylin/Cloud9/docs/con

Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Ted Yu
I found https://issues.apache.org/jira/browse/HADOOP-6812. You can add the following to core-site.xml:

<property>
  <name>fs.inmemory.size.mb</name>
  <value>100</value>
</property>

The default value is 100: int size = Integer.parseInt(conf.get("fs.inmemory.size.mb", "100")); ./src/core/org/apache/hadoop/fs/InMemoryFileSystem.java On Thu, Jul 1,

Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Ted Yu
useful parameter IMHO. > > Anyone knows about this? Thanks in advance! > > Best Regards, > Carp > > 2010/7/2 Ted Yu > > > I found https://issues.apache.org/jira/browse/HADOOP-6812 > > > > You can add the following to core-site.xml: > > > > fs.i

Re: Help Regarding MAPREDUCE

2010-07-01 Thread Ted Yu
Take a look at org.apache.hadoop.mapred.TaskTracker.MapOutputServlet which is inside src/mapred/org/apache/hadoop/mapred/TaskTracker.java On Thu, Jul 1, 2010 at 8:53 AM, Ahmad Shahzad wrote: > Hi, > Can anyone tell me that which directories in the hadoop core directory > should i look at if

Re: Intermediate files generated.

2010-07-08 Thread Ted Yu
The first part of the statement isn't necessarily correct - a SequenceFile is written to HDFS. On Thu, Jul 8, 2010 at 4:29 PM, Pramy Bhats wrote: > Correct me, If I am wrong. The output of the Mappers go to local file > system. And reducers, later fetches the output of Mappers. > > If the above st

Re: Help with Hadoop runtime error

2010-07-09 Thread Ted Yu
Please see the description about xcievers at: http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#requirements You can confirm that you have a xcievers problem by grepping the datanode logs with the error message pasted in the last bullet point. On Fri, Jul 9, 2010 at 1:10 PM, Raymond

Re: Help with Hadoop runtime error

2010-07-09 Thread Ted Yu
reinstalled linux (same version) > and > moved from hadoop 0.20.1 to 0.20.2. > > > > - Original Message > From: Ted Yu > To: common-user@hadoop.apache.org > Sent: Fri, July 9, 2010 4:26:30 PM > Subject: Re: Help with Hadoop runtime error > > Please see

Re: help with hadoop source code

2010-07-11 Thread Ted Yu
http://hadoop.apache.org/common/version_control.html#Anonymous+Access+%28read-only%29 On Sat, Jul 10, 2010 at 11:13 PM, Rahul.V. wrote: > Hi, > Am trying to implement a small map reduce framework on another file system. > A very elementary one at that. > So I am trying to access the Hadoop source

Re: Terasort problem

2010-07-11 Thread Ted Yu
mapred.tasktracker.reduce.tasks.maximum and mapred.tasktracker.map.tasks.maximum are configured in mapred-site.xml. They're cluster-wide. Hadoop would sync configuration from the name node to the data nodes upon startup, so you don't need to configure each datanode individually. "Too many fetch-failures..." error
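A hedged sketch of how those two slot limits are typically set in mapred-site.xml; the values of 4 and 2 below are illustrative, not taken from the thread:

```xml
<!-- mapred-site.xml: per-TaskTracker slot limits, applied cluster-wide.
     The values here are examples only; tune them to your hardware. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```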

Re: newly built Jar file

2010-07-11 Thread Ted Yu
It should be in hadoop-core jar: tyumac:hadoop-0.20.2+320 tyu$ jar tvf build/hadoop-core-0.20.2-CDH3b2-SNAPSHOT.jar | grep PlatformName 1048 Fri Jul 02 11:31:04 PDT 2010 org/apache/hadoop/util/PlatformName.class Did you use "ant jar" command to build ? On Sun, Jul 11, 2010 at 3:09 PM, Pramy Bha

Re: map.input.file in 20.1

2010-07-12 Thread Ted Yu
How about: FileSplit fileSplit = (FileSplit) context.getInputSplit(); String sFileName = fileSplit.getPath().getName(); On Mon, Jul 12, 2010 at 2:56 PM, David Hawthorne wrote: > I'm trying to get the name of the file that the map job is operating on out > of the Context passed to the setup funct
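Those two lines can be sketched as a complete mapper setup method. This is a compile sketch only, assuming the new (org.apache.hadoop.mapreduce) API and an InputFormat that actually produces FileSplit instances; the class name is ours:

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class FileNameMapper extends Mapper<LongWritable, Text, Text, Text> {
    private String fileName;

    @Override
    protected void setup(Context context) {
        // The cast is only safe when the InputFormat produces FileSplits
        // (e.g. TextInputFormat); other input formats may return a
        // different InputSplit subclass.
        FileSplit fileSplit = (FileSplit) context.getInputSplit();
        fileName = fileSplit.getPath().getName();
    }
}
```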

Re: java.lang.OutOfMemoryError: Java heap space

2010-07-12 Thread Ted Yu
Normally task tracker isn't run on Name node. Did you configure otherwise ? On Mon, Jul 12, 2010 at 3:06 PM, Shuja Rehman wrote: > *Master Node output:* > > total used free shared buffers cached > Mem: 2097328 515576 1581752 0 56060 254760

Re: Debuging hadoop core

2010-07-13 Thread Ted Yu
Find hadoop-site.xml which Eclipse claimed was in your classpath. In the same directory, look for core-site.xml and add the following:

<property>
  <name>fs.default.name</name>
  <value>hdfs://sjc9-flash-grid04.ciq.com:9000</value>
</property>

On Tue, Jul 13, 2010 at 3:07 PM, Pramy Bhats wrote: > Hi, > > I am trying to debug the new built hadoop

Re: Debuging hadoop core

2010-07-14 Thread Ted Yu
in me the reasoning behind this ? > > thanks, > --PB > > On Wed, Jul 14, 2010 at 5:09 AM, Ted Yu wrote: > > > Find hadoop-site.xml which Eclipse claimed was in your classpath. > > In the same directory, look for core-site.xml and add the following: > > &

Re: This file system object ...does not support access to the request path ...

2010-07-14 Thread Ted Yu
As the error message suggested, call LocalFileSystem.get() Regards On Wed, Jul 14, 2010 at 8:21 AM, Bradford Stephens < bradfordsteph...@gmail.com> wrote: > Hey guys, > > I'm running a S3->EMR job that needs to save some temp files in a > local dir. Unfortunately, I'm getting this message: > > j

Re: Hadoop's datajoin

2010-07-14 Thread Ted Yu
Please read the source code of DataJoinJob.java Then you would know that the last parameter should be the number of reducers. On Wed, Jul 14, 2010 at 2:33 AM, Denim Live wrote: > Hi, > > Thanks. I have located the datajoin jar. Now I execute the progam the same > way > as specified in the readme

Re: Hadoop Streaming (with Python) and Queue's

2010-07-14 Thread Ted Yu
If you're using capacity scheduler, see: http://hadoop.apache.org/common/docs/r0.20.2/capacity_scheduler.html#Setting+up+queues The queues can be checked through job tracker web UI under Scheduling Information section On Wed, Jul 14, 2010 at 9:57 AM, Moritz Krog wrote: > I second that observatio

Re: Killed : GC overhead limit exceeded

2010-07-16 Thread Ted Yu
Have you tried increasing memory beyond 1GB for your map task ? I think you have noticed that both OOMEs came from Pattern.compile(). Please take a look at http://www.docjar.com/html/api/java/lang/String.java.html I would suggest pre-compiling the three patterns when setting up your mapper - basi
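The pre-compilation advice can be sketched as follows. The pattern strings and the field cleanup are illustrative guesses at the poster's code, not taken from it:

```java
import java.util.regex.Pattern;

public class PrecompiledPatterns {
    // Compile once per task, not once per record: String.split() and
    // String.replaceAll() recompile their regex on every call, which is
    // what puts Pattern.compile() at the top of the OOME stack traces.
    private static final Pattern TAB = Pattern.compile("\t");
    private static final Pattern SPACE = Pattern.compile(" ");

    public static String[] parse(String line) {
        String[] values = TAB.split(line);
        for (int i = 0; i < values.length; i++) {
            // Reuse the compiled pattern for every field.
            values[i] = SPACE.matcher(values[i]).replaceAll("");
        }
        return values;
    }

    public static void main(String[] args) {
        String[] v = parse("a b\tc d");
        if (!v[0].equals("ab") || !v[1].equals("cd")) {
            throw new AssertionError("unexpected parse result");
        }
        System.out.println("ok");
    }
}
```

In a mapper, the two static finals would live on the mapper class (or be built in setup()), so every call to map() reuses them.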

Re: Killed : GC overhead limit exceeded

2010-07-18 Thread Ted Yu
Pattern.split(value.toString()) { > > String[] values = tabPattern.split(line); > > for (int i = 0; i < values.length; i++) { > values[i] = spacePattern.matcher(values[i]).replaceAll(""); > } > parser.setvals(values); > >

Re: empty input

2010-07-26 Thread Ted Yu
No. There is no InputSplit generated from them. On Mon, Jul 26, 2010 at 12:37 PM, Gang Luo wrote: > Hi all, > assume some of my files are empty (size is 0) and I name them as the input > to my > MR job, will a map task be launched on each of them? > > Thanks, > -Gang > > > > >

Re: what affects number of reducers launched by hadoop?

2010-07-28 Thread Ted Yu
The 3 stages for the reducer are: copy, sort, reduce. On Wed, Jul 28, 2010 at 12:24 PM, Vitaliy Semochkin wrote: > Hi, > > in my cluster mapred.tasktracker.reduce.tasks.maximum = 4 > however during monitoring the job in job tracker I see only 1 reducer > working > > first it is > reduce > copy - can som

Re: error:Caused by: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzopCodec

2010-07-29 Thread Ted Yu
Yes. On Thu, Jul 29, 2010 at 7:57 AM, Alex Luya wrote: > Hi, > >Run:ps -aef | grep -i tasktracker > I got this: > > - > alex 2425 1 0 22:34 ?00:00:05 > /usr/local/hadoop/jdk1.6.0

Re: mapred.userlog.retain.hours

2010-07-29 Thread Ted Yu
Have you restarted your cluster ? You can actually specify this parameter in JobConf. See the usage: TaskLog.cleanup(job.getInt("mapred.userlog.retain.hours", 24)); ./src/mapred/org/apache/hadoop/mapred/Child.java On Thu, Jul 29, 2010 at 10:30 AM, vishalsant wrote: > > I have chnaged on
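If you prefer the config-file route over JobConf, the property goes in mapred-site.xml; the 12-hour value below is just an example:

```xml
<property>
  <name>mapred.userlog.retain.hours</name>
  <value>12</value>
  <!-- Hours to keep user task logs after job completion; the code
       above falls back to 24 when the property is unset. -->
</property>
```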

Re: How to make Hadoop listen on multiple network interfaces ?

2010-07-30 Thread Ted Yu
Hadoop uses DNS lookup to associate IP with hostname. It would be better if you follow the rack concept and have a ToR (Top of Rack) switch which allows for port bonding. See HBASE-2502 also. On Thu, Jul 29, 2010 at 11:57 PM, 杨杰 wrote: > Hi, everyone, > > We are now trying building a hadoop cl

Re: How to get job configuration from external source

2010-08-12 Thread Ted Yu
Check $HADOOP_HOME/logs directory where you can find: job_201008102250_0035_conf.xml job_201008102250_0092_conf.xml ... On Thu, Aug 12, 2010 at 10:58 AM, patek tek wrote: > Hi All, > I am building a monitoring tool for my Hadoop cluster. I have been able to > collect most of the data I need fr

Re: what "role" should I assign the weakest node in a hadoop cluster

2010-08-19 Thread Ted Yu
It depends on how weak this node is because NameNode is SPOF. On Thu, Aug 19, 2010 at 2:13 PM, Alejandro Montenegro wrote: > Hi, > Im working on a proof of concept with hadoop and I have got a little lab > with 4 machine, one of those has half of ram memory than the rest so I > would > llike to a

Re: add new data node failed: incompatible build version (CDH3)

2010-08-23 Thread Ted Yu
CDH 0.20.2+320 isn't compatible with 0.20.2+228. You need to upgrade the whole cluster. On Mon, Aug 23, 2010 at 2:00 PM, jiang licht wrote: > The version for the current cluster is Cloudera 0.20.2+228 > > A newer version of CDH 0.20.2+320 is installed on a new machine to be used > as a new datanode.

Re: command to start and stop balancer?

2010-08-24 Thread Ted Yu
start-balancer.sh calls 'start balancer'. They have the same effect. On Tue, Aug 24, 2010 at 12:13 PM, jiang licht wrote: > In current hadoop documentation, it is "hadoop balancer [-threshold > ]" to start a balancer and to stop the balancer press ctrl-c. > > But in some other places (YDN and older h

Re: quota?

2010-08-25 Thread Ted Yu
Refer to http://hadoop.apache.org/common/docs/r0.20.0/hdfs_quota_admin_guide.html#Space+Quotas On Wed, Aug 25, 2010 at 3:43 PM, jiang licht wrote: > Is it possible to tell hadoop to restrict space usage of a specific dfs > folder in the cluster, e.g. a user home directory (/user/accountA in dfs)
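From the quota guide linked above, the relevant commands look like the following; the directory and size are placeholders, and these require a running cluster:

```shell
# Limit /user/accountA to 10 GB of raw space (the quota counts
# all replicas, so 3x replication consumes the quota 3x as fast).
hadoop dfsadmin -setSpaceQuota 10g /user/accountA

# Inspect the current quotas and usage for the directory.
hadoop fs -count -q /user/accountA

# Remove the space quota again.
hadoop dfsadmin -clrSpaceQuota /user/accountA
```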

Re: data in compression format affect mapreduce speed

2010-08-25 Thread Ted Yu
Compressed data would increase processing time in mapper/reducer but decrease the amount of data transferred between tasktracker nodes. Normally you should consider applying some form of compression. On Wed, Aug 25, 2010 at 7:32 PM, shangan wrote: > will data stored in compression format affect

Re: JIRA down

2010-08-25 Thread Ted Yu
In case you need to access JIRA tonight, google JIRA number and click on Cached link. You would see: http://webcache.googleusercontent.com/search?q=cache:Tgi71phHrUoJ:https://issues.apache.org/jira/browse/HBASE-2893+hbase+metadata+layer&cd=4&hl=en&ct=clnk&gl=us&client=firefox-a On Wed, Aug 25, 20

Re: Why does Generic Options Parser only take the first -D option?

2010-09-02 Thread Ted Yu
I checked GenericOptionsParser from 0.20.2. processGeneralOptions() should be able to process all -D options: if (line.hasOption('D')) { String[] property = line.getOptionValues('D'); for (String prop : property) { String[] keyval = prop.split("=", 2); if (keyval.le
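The key detail in that snippet is prop.split("=", 2): the limit of 2 splits only on the first '=', so property values may themselves contain '='. A standalone sketch of the same loop (class and method names are ours, not Hadoop's):

```java
import java.util.HashMap;
import java.util.Map;

public class DashDParser {
    // Mirrors the loop in GenericOptionsParser.processGeneralOptions():
    // each -D value is split on the FIRST '=' only, so "a.b=x=y" becomes
    // key "a.b" with value "x=y".
    public static Map<String, String> parse(String[] properties) {
        Map<String, String> conf = new HashMap<String, String>();
        for (String prop : properties) {
            String[] keyval = prop.split("=", 2);
            if (keyval.length == 2) {
                conf.put(keyval[0], keyval[1]);
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf =
            parse(new String[] {"mapred.reduce.tasks=4", "a.b=x=y"});
        if (!"x=y".equals(conf.get("a.b"))) {
            throw new AssertionError("split limit not honored");
        }
        System.out.println(conf.size()); // prints 2
    }
}
```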

Re: Sort with customized input/output !!

2010-09-07 Thread Ted Yu
Please get hadoop source code and read the comment at the beginning of SequenceFile.java: * Essentially there are 3 different formats for SequenceFiles ... On Tue, Sep 7, 2010 at 8:13 PM, Matthew John wrote: > Hey , > M pretty new to Hadoop . > > I need to Sort a Metafile (TBs) and thought of us

Re: My mappers stop responding even though they reach 100%

2010-09-09 Thread Ted Yu
Does your mapper access external resources which may take some time to return ? On Thu, Sep 9, 2010 at 11:28 AM, Pavel Gutin wrote: > I've been having a problem for the past few weeks. I will kick off a > job that will have a bunch of map tasks. At some point in the job > (sometimes at 1%, somet

Re: A new way to merge up those small files!

2010-09-25 Thread Ted Yu
Edward: Thanks for the tool. I think the last parameter can be omitted if you follow what hadoop fs -text does. It looks at a file's magic number so that it can attempt to *detect* the type of the file. Cheers On Fri, Sep 24, 2010 at 11:41 PM, Edward Capriolo wrote: > Many times a hadoop job pr

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-09-26 Thread Ted Yu
Have you tried lowering mapred.job.reuse.jvm.num.tasks ? On Sun, Sep 26, 2010 at 3:30 AM, Bradford Stephens < bradfordsteph...@gmail.com> wrote: > Nope, that didn't seem to help. > > On Sun, Sep 26, 2010 at 1:00 AM, Bradford Stephens > wrote: > > I'm going to try running it on high-RAM boxes wit

Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

2010-09-26 Thread Ted Yu
> > On Sun, Sep 26, 2010 at 6:47 AM, Ted Yu wrote: > > Have you tried lowering mapred.job.reuse.jvm.num.tasks ? > > > > On Sun, Sep 26, 2010 at 3:30 AM, Bradford Stephens < > > bradfordsteph...@gmail.com> wrote: > > > >> Nope, that didn't seem

Re: Proper blocksize and io.sort.mb setting when using compressed LZO files

2010-09-27 Thread Ted Yu
The setting should be fs.inmemory.size.mb On Mon, Sep 27, 2010 at 7:15 AM, pig wrote: > HI Sriguru, > > Thank you for the tips. Just to clarify a few things. > > Our machines have 32 GB of RAM. > > I'm planning on setting each machine to run 12 mappers and 2 reducers with > the heap size set to

Re: Proper blocksize and io.sort.mb setting when using compressed LZO files

2010-09-27 Thread Ted Yu
pache.org/common/docs/r0.20.2/hdfs-default.html > > Should this be something that needs to be added? > > Thank you for the help! > > ~Ed > > On Mon, Sep 27, 2010 at 11:18 AM, Ted Yu wrote: > > > The setting should be fs.inmemory.size.mb > > > > On Mon, S

Re: java.lang.RuntimeException: java.io.EOFException at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)

2010-09-29 Thread Ted Yu
Your MsRead.readFields() doesn't contain readInt(). Can you show us the lines around line 84 of MsRead.java ? On Wed, Sep 29, 2010 at 2:44 PM, Tali K wrote: > > HI All, > > I am getting this Exception on a cluster(10 nodes) when I am running > simple hadoop map / reduce job. > I don't have thi

Re: java.lang.RuntimeException: java.io.EOFException at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:103)

2010-09-30 Thread Ted Yu
Line 84 is empty. Line 83 is: out.writeUTF(query_id); Please send the stack trace that corresponds to your attachment. From previous discussion: In the very beginning of readFields(), clear all available fields (lists, primitives, etc). The best way to do that is to create a clearFie
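The rule from the thread is that readFields() must clear old state, then consume exactly the fields write() produced, in the same order. A minimal standalone sketch of such a record: only query_id appears in the thread, so the int field, class name, and values here are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Writable-style record: write() and readFields() must mirror each other
// field for field, or the next record's compare/read hits an EOFException.
public class MsReadLike {
    String queryId = "";
    int score;

    public void write(DataOutput out) throws IOException {
        out.writeUTF(queryId);
        out.writeInt(score);
    }

    public void readFields(DataInput in) throws IOException {
        queryId = "";            // clear all fields before reading
        score = 0;
        queryId = in.readUTF();  // mirrors writeUTF above
        score = in.readInt();    // mirrors writeInt above
    }

    public static void main(String[] args) throws IOException {
        MsReadLike a = new MsReadLike();
        a.queryId = "q1";
        a.score = 7;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        a.write(new DataOutputStream(bytes));

        MsReadLike b = new MsReadLike();
        b.readFields(new DataInputStream(
            new ByteArrayInputStream(bytes.toByteArray())));
        if (!b.queryId.equals("q1") || b.score != 7) {
            throw new AssertionError("round trip failed");
        }
        System.out.println("ok");
    }
}
```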

Re: Passing commandline arguments to Mapper class

2010-10-02 Thread Ted Yu
When you pass a command-line argument while running the job (using '-D'), it is set for you in the configuration when the job's configuration is created. On Sat, Oct 2, 2010 at 4:35 PM, coder22 wrote: > > I need to pass commandline arguments to Mapper Class. > > > public class Size ex

Re: conf.setCombinerClass in Map/Reduce

2010-10-06 Thread Ted Yu
If input to reducer is of <K2, list(V2)>, the combiner would take in <K2, list(V2)> and emit <K2, V2>. On Tue, Oct 5, 2010 at 10:03 PM, Shi Yu wrote: > Hi, thanks for the answer, Antonio. > > I have found one of the main problem. It was because I used the > MultipleOutputs in the Reduce class, so when I set the Combiner and the >

Re: What is the best to terminate a Map job without it being retried

2010-10-08 Thread Ted Yu
How about deleting/moving the dirty files in your mapper or in another job ? On Fri, Oct 8, 2010 at 4:30 PM, Steve Kuo wrote: > I have a collection of dirty data files, which I can detect during the > setup() phase of my Map job. It would be best that I can quit the map job > and prevent it fro

Re: FUSE HDFS significantly slower

2010-10-25 Thread Ted Yu
https://issues.apache.org/jira/browse/HADOOP-3805 tried to mitigate this problem. On Mon, Oct 25, 2010 at 10:17 PM, aniket ray wrote: > Hi, > > I'm seeing in my experiments that Fuse-HDFS is significantly slower (around > 3x slower) than using the Java hdfs API directly. > Wanted to ask if this

Re: setMaxMapAttempts() isn't working

2010-11-05 Thread Ted Yu
I checked cdh3b2. From TaskInProgress: private void setMaxTaskAttempts() { if (isMapTask()) { this.maxTaskAttempts = conf.getMaxMapAttempts(); } else { It should work. What hadoop version are you using ? On Fri, Nov 5, 2010 at 3:55 PM, Keith Wiley wrote: > Has anyone else obser

Re: hadoop namespace size

2010-11-07 Thread Ted Yu
Take a look at http://your-name-node:50070/dfsnodelist.jsp?whatNodes=LIVE On Sun, Nov 7, 2010 at 9:37 AM, Null Ecksor wrote: > I have a small question... > > I have a small cluster setup of 20 nodes. I was wondering - How to check > the > namespace size for my cluster, or the details like how ma

Re: What are xcievers?

2010-11-07 Thread Ted Yu
Take a look at DataXceiverServer.java:

/**
 * Server used for receiving/sending a block of data.
 * This is created to listen for requests from clients or
 * other DataNodes. This small server does not use the
 * Hadoop IPC mechanism.
 */

On Sun, Nov 7, 2010 at 12:43 PM, Abhinay Mehta wrote: >

Re: while loop issue in reduce

2010-12-04 Thread Ted Yu
Is a_el an ArrayList ? If not, you only get the last element's value. On Sat, Dec 4, 2010 at 9:50 AM, Ranjan Sen wrote: > > Hi > > I want to copy the values from the key and Iterator of value input in a > reduce function. I was using a ArrayList to copy the list content into. I > was using a > w
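Beyond using an ArrayList, the other classic cause of "only the last value" in a reducer is that Hadoop reuses one object for every value in the iterator, so you must copy values out rather than store the reference. A pure-Java sketch of the difference, where Holder stands in for a reused Writable (all names here are ours):

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    // Stands in for a Writable that the framework mutates in place.
    static class Holder { int v; }

    // Wrong: every list entry is the SAME reused object, so after the
    // loop all entries show the last value.
    public static List<Holder> storeRefs(int[] vals) {
        Holder reused = new Holder();
        List<Holder> out = new ArrayList<Holder>();
        for (int v : vals) {
            reused.v = v;
            out.add(reused);
        }
        return out;
    }

    // Right: copy the value out of the reused object before storing it.
    public static List<Integer> storeCopies(int[] vals) {
        Holder reused = new Holder();
        List<Integer> out = new ArrayList<Integer>();
        for (int v : vals) {
            reused.v = v;
            out.add(reused.v);   // autoboxes a fresh Integer each time
        }
        return out;
    }

    public static void main(String[] args) {
        int[] vals = {1, 2, 3};
        System.out.println(storeRefs(vals).get(0).v);   // prints 3
        System.out.println(storeCopies(vals).get(0));   // prints 1
    }
}
```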

Re: Reduce Error

2010-12-08 Thread Ted Yu
Any chance mapred.local.dir is under /tmp and part of it got cleaned up ? On Wed, Dec 8, 2010 at 4:17 AM, Adarsh Sharma wrote: > Dear all, > > Did anyone encounter the below error while running job in Hadoop. It occurs > in the reduce phase of the job. > > attempt_201012061426_0001_m_000292_0: >

Re: Reduce Error

2010-12-08 Thread Ted Yu
and figure out >> if there are FS or permssion problems. >> >> Raj >> >> >> >> From: Adarsh Sharma >> To: common-user@hadoop.apache.org >> Sent: Wed, December 8, 2010 7:48:47 PM >> Subject: Re: Reduce Err

Re: Question about AvatarNode

2010-12-13 Thread Ted Yu
Check out the code on github You can find contrib/highavailability/src/java/org/apache/hadoop/hdfs/AvatarZooKeeperClient.java On Sun, Dec 12, 2010 at 11:54 PM, ChingShen wrote: > Hi all, > > I read the "Looking at the code behind our three uses of Apache Hadoop"( > http://www.facebook.com/note

Re: Please help with hadoop configuration parameter set and get

2010-12-17 Thread Ted Yu
You can use hadoop counter to pass this information. This way, you see the counters in job report. On Thu, Dec 16, 2010 at 10:58 PM, Peng, Wei wrote: > Hi, > > > > I am a newbie of hadoop. > > Today I was struggling with a hadoop problem for several hours. > > > > I initialize a parameter by set
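The counter suggestion can be sketched like this in a mapper using the new API. This is a compile sketch only; the group and counter names are made up:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CountingMapper
        extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Counters are aggregated across all tasks and shown in the job
        // report, which makes them a simple channel for passing a value
        // back to the driver without touching the job configuration.
        context.getCounter("MyApp", "RECORDS_SEEN").increment(1);
    }
}
```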

Re: breadth-first search

2010-12-22 Thread Ted Yu
Modify the following parameters: mapred.tasktracker.map.tasks.maximum mapred.tasktracker.reduce.tasks.maximum mapred.map.tasks mapred.reduce.tasks FYI you need to adjust the -Xmx for your mapper/reducer after increasing the values for above parameters On Wed, Dec 22, 2010 at 11:51 AM, Peng, Wei

Re: how to build hadoop in Linux

2011-01-01 Thread Ted Yu
For question #3, this should be helpful: http://ant.apache.org/manual/tasksoverview.html#compile On Sat, Jan 1, 2011 at 6:53 AM, Da Zheng wrote: > Happy new year! > > Thanks. After applying the patch, I can compile the code with > ant -Dforrest.home=/home/zhengda/apache-forrest-0.8 compile-core

Re: how to build hadoop in Linux

2011-01-01 Thread Ted Yu
x27;s actually related > to > my first question. For example, if I want to just compile the code (java > and the > related native C code) but not build documents, which target should I > choose? > > Best, > Da > > On 1/1/11 10:31 AM, Ted Yu wrote: > > For question #3

Re: Help: How to increase amont maptasks per job ?

2011-01-07 Thread Ted Yu
Set higher values for mapred.tasktracker.map.tasks.maximum (and mapred.tasktracker.reduce.tasks.maximum) in mapred-site.xml On Fri, Jan 7, 2011 at 12:58 PM, Tali K wrote: > > > > > We have a jobs which runs in several map/reduce stages. In the first job, > a large number of map tasks -82 are i

Re: Help: How to increase amont maptasks per job ?

2011-01-07 Thread Ted Yu
Check out mapred.map.tasks and mapred.reduce.tasks On Fri, Jan 7, 2011 at 1:40 PM, Tali K wrote: > > According to the documentation, that parameter is for the number of >tasks *per TaskTracker*. I am asking about the number of tasks >for the entire job and entire cluster. That paramete

Re: libjars options

2011-01-11 Thread Ted Yu
Refer to Alex Kozlov's answer on 12/11/10 On Tue, Jan 11, 2011 at 10:10 AM, C.V.Krishnakumar Iyer wrote: > Hi, > > Could anyone please guide me as to how to use the -libjars option in HDFS? > > I have added the necessary jar file (the hbase jar - to be precise) to the > classpath of the node whe

Re: Hive rc

2011-01-20 Thread Ted Yu
Check out processCmd() in CliDriver.java Basically .hiverc contains multiple lines of CLI commands. You can use backslash to spread one command across multiple lines. On Thu, Jan 20, 2011 at 2:25 PM, abhatna...@vantage.com < abhatna...@vantage.com> wrote: > > Hi > > Does anybody has idea of how

Re: How to get metrics information?

2011-01-22 Thread Ted Yu
You can use the following code: JobClient jc = new JobClient(jobConf); int numReduces = jc.getClusterStatus().getMaxReduceTasks(); For 0.20.3, you can use: ClusterMetrics metrics = jobTracker.getClusterMetrics(); On Sat, Jan 22, 2011 at 9:57 AM, Zhenhua Guo wrote: > I want t

Re: How to get metrics information?

2011-01-22 Thread Ted Yu
Guo wrote: > Thanks! > How to get JobTracker object? > > Gerald > > On Sun, Jan 23, 2011 at 5:46 AM, Ted Yu wrote: > > You can use the following code: > >JobClient jc = new JobClient(jobConf); > >int numReduces = jc.getClusterStatus().getMax

Re: Distributed indexing with Hadoop

2011-01-29 Thread Ted Yu
$MAHOUT_HOME/examples/bin/build-reuters.sh FYI On Sat, Jan 29, 2011 at 12:57 AM, Marco Didonna wrote: > On 01/29/2011 05:17 AM, Lance Norskog wrote: > > Look at the Reuters example in the Mahout project: > http://mahout.apache.org > > Ehm could you point me to it ? I cannot find it > > Thanks > >

Re: Writing Reducer output to database

2011-02-03 Thread Ted Yu
At least in cdh3b2, there are two DBOutputFormat.java: ./src/mapred/org/apache/hadoop/mapred/lib/db/DBOutputFormat.java ./src/mapred/org/apache/hadoop/mapreduce/lib/db/DBOutputFormat.java You should be able to use the latter. On Thu, Feb 3, 2011 at 2:45 PM, Adeel Qureshi wrote: > I had started

Re: Writing Reducer output to database

2011-02-06 Thread Ted Yu
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:809) >at > > org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:549) >at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631) >at org.apache.hadoop.mapred.MapTask.run(MapTa

Re: Writing Reducer output to database

2011-02-06 Thread Ted Yu
a:1063) >at org.apache.hadoop.mapred.Child.main(Child.java:211) > > > On Sun, Feb 6, 2011 at 11:00 AM, Ted Yu wrote: > > > I think you have looked at > > src/examples/org/apache/hadoop/examples/DBCountPageView.java > > where: > >job.setMap

Re: A slave knows the other in hadoop?

2011-02-16 Thread Ted Yu
>> Need i put the public key of the first slave in authorized_keys of the second ? That is not needed. On Wed, Feb 16, 2011 at 10:19 AM, Sandro Simas wrote: > This is a simple question, a slave know the other slave in Hadoop ? > Need i put the public key of the first slave in authorized_keys of

Re: Externalizing Hadoop configuration files

2011-02-16 Thread Ted Yu
You need to develop externalization yourself. Our installer uses place holders such as:

<property>
  <name>fs.checkpoint.dir</name>
  <value>dataFolderPlaceHolder/dfs/namesecondary,backupFolderPlaceHolder/namesecondary</value>
</property>

They would be resolved at time of deployment. On Wed, Feb 16, 2011 at 3:06 PM, Jing Zang wrote: > > How do
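A minimal sketch of that deploy-time resolution using sed; the template content, placeholder names, and target paths below follow the style described in the message but are otherwise our own:

```shell
# Create a template containing placeholders (illustrative content).
cat > core-site.template.xml <<'EOF'
<property>
  <name>fs.checkpoint.dir</name>
  <value>dataFolderPlaceHolder/dfs/namesecondary,backupFolderPlaceHolder/namesecondary</value>
</property>
EOF

# Resolve the placeholders at deployment time.
sed -e 's|dataFolderPlaceHolder|/data/1|g' \
    -e 's|backupFolderPlaceHolder|/backup|g' \
    core-site.template.xml > core-site.xml

# The placeholders are now concrete paths.
grep namesecondary core-site.xml
```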

Re: Cost of bytecode execution in MapReduce

2011-02-16 Thread Ted Yu
Is your target development environment using C++ ? On Wed, Feb 16, 2011 at 9:49 PM, Matthew John wrote: > Hi all, > > I wanted to know if the Map/Reduce (Mapper and Reducer) code incurs > any fixed cost of ByteCode execution. And how do the mappers (say of > WordCount MR) look like in detail (in

Re: Cost of bytecode execution in MapReduce

2011-02-17 Thread Ted Yu
ost ? > I am not using any specific development environment now (like Eclipse) > . > > Matthew > > On Thu, Feb 17, 2011 at 11:52 AM, Ted Yu wrote: > > Is your target development environment using C++ ? > > > > On Wed, Feb 16, 2011 at 9:49 PM, Matthew John < >

Re: Cost of bytecode execution in MapReduce

2011-02-17 Thread Ted Yu
Are you investigating alternative map-reduce framework ? Please read: http://www.craighenderson.co.uk/mapreduce/ On Thu, Feb 17, 2011 at 9:45 AM, Matthew John wrote: > Hi Ted, > > Can u provide a link to the same ? Not able to find it :( . > > > On Thu, Feb 17, 2011 at 9:54

Re: hadoop-hdfs-client splitoff is going to break code

2015-10-14 Thread Ted Yu
+1 on option 2. On Wed, Oct 14, 2015 at 10:56 AM, larry mccay wrote: > Interesting... > > As long as #2 provides full backward compatibility and the ability to > explicitly exclude the server dependencies that seems the best way to go. > That would get my non-binding +1. > :) > > Perhaps we coul

Re: [VOTE] Release Apache Hadoop 2.6.2

2015-10-22 Thread Ted Yu
Ran hbase test suite (0.98 branch) by pointing to maven repo below. All tests passed. Cheers On Thu, Oct 22, 2015 at 2:14 PM, Sangjin Lee wrote: > Hi all, > > I have created a release candidate (RC0) for Hadoop 2.6.2. > > The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0

Re: Disable some of the Hudson integration comments on JIRA

2015-11-26 Thread Ted Yu
Looking at a few Hadoop-trunk-Commit builds, I saw 'Some Enforcer rules have failed.' Below was from build #8895 : [WARNING] Dependency convergence error for org.apache.hadoop:hadoop-auth:3.0.0-SNAPSHOT paths to dependency are: +-org.apache.hadoop:hadoop-common:3.0.0-SNAPSHOT +-org.apache.hadoop

Re: [VOTE] Release Apache Hadoop 2.7.2 RC1

2015-12-17 Thread Ted Yu
Hi, I have run test suite for tip of hbase 0.98 branch against this RC. All tests passed. +1 On Wed, Dec 16, 2015 at 6:49 PM, Vinod Kumar Vavilapalli wrote: > Hi all, > > I've created a release candidate RC1 for Apache Hadoop 2.7.2. > > As discussed before, this is the next maintenance release

Re: is jenkins testing PRs?

2016-01-11 Thread Ted Yu
Once you log in, you can specify the YARN JIRA number using: https://builds.apache.org/job/PreCommit-yarn-Build/build?delay=0sec FYI On Mon, Jan 11, 2016 at 9:01 AM, Steve Loughran wrote: > > I submitted some PR-based patches last week —they haven't been tested yet > > https://issues.apache.or

Re: [VOTE] Release Apache Hadoop 2.6.4 RC0

2016-02-03 Thread Ted Yu
I modified hbase pom.xml (0.98 branch) to point to staged maven artifacts. All unit tests passed. Cheers On Tue, Feb 2, 2016 at 11:01 PM, Junping Du wrote: > Hi community folks, >I've created a release candidate RC0 for Apache Hadoop 2.6.4 (the next > maintenance release to follow up 2.6.3

Re: Jira Lock Down Upgraded?

2016-05-12 Thread Ted Yu
Looks like side effects of this lock down are: 1. person (non-admin) who logged JIRA cannot comment on the JIRA 2. result of QA run cannot be posted onto JIRA (at least for hbase tests) :-( On Thu, May 12, 2016 at 3:10 PM, Andrew Wang wrote: > Try asking on infra.chat (Apache INFRA's hipchat).

Re: Different JIRA permissions for HADOOP and HDFS

2016-05-14 Thread Ted Yu
Looks like you attached some images which didn't go through. Consider using 3rd party image site. Cheers On Sat, May 14, 2016 at 7:07 AM, Zheng, Kai wrote: > Hi, > > > > Noticed this difference but not sure if it’s intended. YARN is similar > with HDFS. It’s not convenient. Any clarifying? Tha

Re: [VOTE] Release Apache Hadoop 2.0.4-alpha

2013-04-09 Thread Ted Yu
>> http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.0.4-alpha-rc0 >> >> The maven artifacts are available via repository.apache.org. >> >> Please try the release and vote; the vote will run for the usual 7 days. >> >> thanks, >> Arun >

Re: test patch fails with -1 findbugs

2013-05-11 Thread Ted Yu
As a validation step, you can run an empty patch (adding whitespace in any file) through test-patch. That gives you the base number of findbugs warnings. Cheers On Sat, May 11, 2013 at 12:24 PM, Amit Sela wrote: > Hi all, > > I've recently added a patch to branch-1 and I wanted to run test-patch. > I

Re: test patch fails with -1 findbugs

2013-05-11 Thread Ted Yu
oop-working-branch/hadoop-common-1/build/test/findbugs/hadoop-findbugs-report.html > [exec] [xslt] Loading stylesheet > /usr/local/findbugs-1.3.9/src/xsl/default.xsl > [exec] [findbugs] Java Result: 3 > > > > On Sat, May 11, 2013 at 11:44 PM, Ted Yu wrote: >

Re: [ANNOUNCE] New Hadoop Committers

2013-05-28 Thread Ted Yu
Congratulations. On Tue, May 28, 2013 at 3:07 PM, Aaron T. Myers wrote: > On behalf of the Apache Hadoop PMC, I'd like to announce the addition of a > few new committers to the Apache Hadoop project: > > * Brandon Li > * Chris Nauroth > * Colin Patrick McCabe > * Ivan Mitic > * Jing Zhao > > We

Re: Heads up: moving from 2.0.4.1-alpha to 2.0.5-alpha

2013-05-31 Thread Ted Yu
I am currently testing HBase 0.95 using 2.0.5-SNAPSHOT artifacts. Would 2.1.0-SNAPSHOT maven artifacts be available after tomorrow's change ? Thanks On Fri, May 31, 2013 at 12:45 PM, Konstantin Boudnik wrote: > Guys, > > I will be performing some changes wrt to moving 2.0.4.1 release candidate

Re: issues.apache.org down?

2013-06-11 Thread Ted Yu
I experience the same problem. I think infrastruct...@apache.org should be notified. Cheers On Tue, Jun 11, 2013 at 11:05 AM, Sangjin Lee wrote: > Is it just me or is issues.apache.org down? From my network, traceroute is > unable to get to issues.apache.org. What is the way to report issues w

Re: a question

2013-07-20 Thread Ted Yu
Did the AdaptiveScheduler come from MAPREDUCE-1380 ? Thanks On Fri, Jul 19, 2013 at 11:23 PM, Hamedreza Berenjian < hamedreza_berenj...@yahoo.com> wrote: > Hi, > I have a jar file from a hadoop scheduler(AdaptiveScheduler=resource aware > slotless adaptive scheduler).I'm sure that this jar file

Re: a question

2013-07-20 Thread Ted Yu
ached the jar file. how I can resolve this problem?please help me... > >*From:* Ted Yu > *To:* common-dev@hadoop.apache.org; Hamedreza Berenjian < > hamedreza_berenj...@yahoo.com> > *Sent:* Saturday, 20 July 2013, 6:52:16 > *Subject:* Re: a question > > Did the Adaptiv

Re: Building Hadoop...

2013-07-28 Thread Ted Yu
You should be using libprotoc 2.4.1 Cheers On Sun, Jul 28, 2013 at 7:08 AM, James Carman wrote: > Is there anything special I have to do to get the build working on my > local machine? I have installed protocol buffers and I of course have > Maven/JDK. I am getting compiler errors relating to

Re: Building Hadoop...

2013-07-28 Thread Ted Yu
doop/ipc/protobuf/RpcHeaderProtos.java:[1497,30] > cannot find symbol > > Is this because protobuf is generating source code using a newer > version and some of the classes aren't there? > > > On Sun, Jul 28, 2013 at 10:15 AM, Ted Yu wrote: >> You should be using li

Re: Building Hadoop...

2013-07-28 Thread Ted Yu
, James Carman >wrote: > > > Okay, cool. That's what I figured. I'll try to figure out how to > > install specific versions using homebrew and move on down the road. > > Thanks! > > > > On Sun, Jul 28, 2013 at 4:22 PM, Ted Yu wrote: > > >

Re: Building Hadoop...

2013-07-29 Thread Ted Yu
official" source repository Git or SVN for this project? > >> > >> On Sun, Jul 28, 2013 at 11:22 PM, Ted Yu wrote: > >>> Thanks for sharing, Chris. > >>> > >>> The following command would produce tar ball, skipping javadoc: > >>

Re: Building Hadoop...

2013-07-29 Thread Ted Yu
. Any suggestions on good areas > to start contributing? > > On Mon, Jul 29, 2013 at 11:47 AM, Ted Yu wrote: > > To my knowledge, attaching patch on the JIRA is the standard way of > > contributing to Hadoop. > > > > BTW there is a small lag between checkin of SVN and t

Re: Java 7 and Hadoop

2013-08-02 Thread Ted Yu
For HBase, under https://builds.apache.org/job/HBase-TRUNK/configure, there is an entry labeled JDK where you can select 'JDK 1.7 (latest)' I guess it would be similar for hadoop builds. Cheers On Fri, Aug 2, 2013 at 10:55 AM, Sandy Ryza wrote: > How do we go about setting up builds on jdk7?

Re: Please unsubscribe me

2013-08-14 Thread Ted Yu
Send an email to common-dev-unsubscr...@hadoop.apache.org Cheers On Wed, Aug 14, 2013 at 6:06 PM, Yuta Morinaga wrote: > Thanks! > > >

Re: unit testing and execution guide

2013-09-14 Thread Ted Yu
Have you read 'Making Changes' section in http://wiki.apache.org/hadoop/HowToContribute ? On Fri, Sep 13, 2013 at 9:26 PM, Hai Huang wrote: > Hi All, > > Are there latest instructions of unit testing and how to run hadoop in > somewhere? Although there are some unit testing and hadoop run do

Re: generated code in hadoop

2013-10-17 Thread Ted Yu
You can checkout trunk code. See SVN Access section in: http://wiki.apache.org/hadoop/HowToContribute After building hadoop, you will find generated code. Cheers On Oct 17, 2013, at 5:08 AM, Jonathan Bernwieser wrote: > Hi there, > > I am currently doing my Bachelor thesis at TU Munich, at

Re: would like to contribute

2013-11-03 Thread Ted Yu
Omar: See this thread also: http://search-hadoop.com/m/aHlAs14Jq7v1 On Sun, Nov 3, 2013 at 8:33 AM, Stas Maksimov wrote: > Hi Omar, > > I don't expect anyone will guide you here, everything is very > straightforward (and googleable by the way): > http://wiki.apache.org/hadoop/HowToContribute >

scope of jersey-test-framework-grizzly2

2013-11-12 Thread Ted Yu
Hi, To answer some question on dev@hbase, I noticed the following dependencies: [INFO] +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.2.0:compile [INFO] | +- org.apache.hadoop:hadoop-yarn-common:jar:2.2.0:compile [INFO] | | +- com.google.inject:guice:jar:3.0:compile [INFO] | | | +-

Re: scope of jersey-test-framework-grizzly2

2013-11-13 Thread Ted Yu
OP-9991 > > > On 13 November 2013 05:09, Ted Yu wrote: > > > Hi, > > To answer some question on dev@hbase, I noticed the following > > dependencies: > > > > [INFO] +- > org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.2.0:compile > > [INFO] |
