Hadoop user event in Europe (interested?)

2011-01-20 Thread Asif Jan

Hi

Wondering if there is interest in organizing a Hadoop meet-up in Europe (Geneva, Switzerland)? It could be a 2-day event discussing the use of Hadoop in industry/science projects.


If this interests you, please let me know.

cheers
asif

Re: another quick question

2010-10-06 Thread Asif Jan

Hi

The tmp directory is local to the machine running the Hadoop system, so if your Hadoop is on a remote machine, the tmp directory has to be on that machine.
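(hadoop.tmp.dir is normally set in core-site.xml on that machine; as a rough sketch, the programmatic equivalent would be something like this, with the path being only an example:)

import org.apache.hadoop.conf.Configuration;

public class TmpDirExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // hadoop.tmp.dir is resolved on the machine running the daemons;
        // /var/hadoop/tmp is only an example path and must exist on that host.
        conf.set("hadoop.tmp.dir", "/var/hadoop/tmp");
        System.out.println(conf.get("hadoop.tmp.dir"));
    }
}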


Your question is not clear to me, e.g. what do you want to do?

asif


On Oct 6, 2010, at 9:55 PM, Maha A. Alabduljalil wrote:


Hi again,

I guess my questions are easy...

Since I'm installing Hadoop on my school machine, I have to view the NameNode online via hdfs://host-name:50070 instead of the default link provided by the Hadoop Quick Start (i.e. hdfs://localhost:50070).


Do you think I should set my hadoop.tmp.dir to the machine I'm currently working on, so that I can do it the default way?



Thank you,

 Maha

Re: Quick question

2010-10-06 Thread Asif Jan

Hi

Check if the ports are open outside the school network; otherwise you will have to use SSH tunneling if you want to access the ports serving the web pages (as it is likely that these are not open by default).

try something like

ssh -L50030:hadoop-host-address:50030 ur-usern...@cluster-head-node


Then open localhost:50030 to see the JobTracker page.



cheers



On Oct 6, 2010, at 9:14 PM, Maha A. Alabduljalil wrote:


Hi everyone,

I've started up Hadoop (HDFS data and name nodes, JobTracker and TaskTrackers) using the Quick Start guidance. The web views of the filesystem and JobTracker suddenly started to give "can't be found" errors in Safari.


Note that I'm actually accessing Hadoop via SSH to my school account. Could that be the problem?


Thank you,

 Maha

mapside joins

2010-07-21 Thread Asif Jan




Hi

Does the join only work with text-like files?

Is it possible to do a map-side join using custom Writables? Say I have writables custom1 and custom2, and there is one common field (say id) that could join the records in these 2 objects.

So would it be possible to join these 2 files and output a writable custom3?
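For context, the kind of wiring I have in mind is roughly this old-API sketch (the paths are made up, and both inputs would need to be sorted and identically partitioned on the join key):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.join.CompositeInputFormat;

public class MapSideJoinSketch {
    public static void main(String[] args) {
        JobConf conf = new JobConf(MapSideJoinSketch.class);
        conf.setInputFormat(CompositeInputFormat.class);
        // Inner join over two sorted, identically partitioned inputs;
        // the paths are hypothetical.
        conf.set("mapred.join.expr", CompositeInputFormat.compose(
                "inner", SequenceFileInputFormat.class,
                new Path("/data/custom1"), new Path("/data/custom2")));
        // The mapper then sees the shared key plus a TupleWritable holding
        // one value per source (org.apache.hadoop.mapred.join.TupleWritable).
    }
}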

thanks 


Re: manipulate counters in new api

2010-07-20 Thread Asif Jan

context.getCounter(Status.failed).increment(1); 
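For example, a minimal sketch of this in a new-API mapper (the Status enum and its members are user-defined, not part of Hadoop):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CountingMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    // User-defined enum: each member shows up as a named counter for the job.
    public enum Status { failed, processed }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        try {
            // ... process the record here ...
            context.getCounter(Status.processed).increment(1);
        } catch (RuntimeException e) {
            // In the new API the Context takes over the old Reporter's role.
            context.getCounter(Status.failed).increment(1);
        }
    }
}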


cheers




On Jul 19, 2010, at 9:48 PM, Gang Luo wrote:


Hi all,
I find the map/reduce methods in the new API look like map/reduce(Object, Iterable, Context). No Reporter appears here as in the old API. How do I add and modify counters in the new API?

Thanks,
-Gang

Re: Single Node with multiple mappers?

2010-07-16 Thread Asif Jan

How is your data being split? Using the mapred.map.tasks property should let you specify how many maps you would want to run (provided your input file is big enough to be split into multiple chunks).
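For example, a rough old-API sketch (the value 8 is arbitrary, and the framework treats it only as a hint):

import org.apache.hadoop.mapred.JobConf;

public class MapTasksHint {
    public static void main(String[] args) {
        JobConf conf = new JobConf(MapTasksHint.class);
        // Equivalent to setting mapred.map.tasks=8; the actual number of
        // map tasks is still driven by the number of input splits.
        conf.setNumMapTasks(8);
    }
}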


asif

On Jul 16, 2010, at 11:03 AM, Moritz Krog wrote:


Hi everyone,

I was curious if there is any option to use Hadoop in single-node mode in a way that enables the process to use more system resources. Right now, Hadoop uses one mapper and one reducer, leaving my i7 at about 20% CPU usage (1 core for Hadoop, .5 cores for my OS), basically idling. Raising the number of map tasks doesn't seem to do much, as this parameter seems to be more of a hint anyway. Still, I have lots of CPU time and RAM left. Any hints on how to use them?

thanks in advance,
Moritz

Re: how to do a reduce-only job

2010-07-16 Thread Asif Jan
You need to join these files into one; you could either do a map-side join or a reduce-side join.

For a map-side join (slightly more involved), look at the example:

org.apache.hadoop.examples.Join

For a reduce-side join, simply create 2 mappers (one for each file) and one reducer (as long as you keep the key-value types the same for both).

You will have to use multiple input formats for doing so.

e.g.
MultipleInputs.addInputPath(conf, path1, input_format1, mapper_class1)
MultipleInputs.addInputPath(conf, path2, input_format2, mapper_class2)

The javadoc of the class explains it further.
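Putting it together, a rough driver sketch (old API; FileAMapper, FileBMapper and SumReducer are hypothetical classes you would write, and the key/value types are just one possible choice):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

public class ReduceSideJoinDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(ReduceSideJoinDriver.class);
        conf.setJobName("reduce-side-join");

        // One mapper per input; both must emit the same key/value types.
        // FileAMapper and FileBMapper are hypothetical user classes.
        MultipleInputs.addInputPath(conf, new Path(args[0]),
                KeyValueTextInputFormat.class, FileAMapper.class);
        MultipleInputs.addInputPath(conf, new Path(args[1]),
                KeyValueTextInputFormat.class, FileBMapper.class);

        // A single reducer sees the values from both inputs under one key.
        conf.setReducerClass(SumReducer.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        FileOutputFormat.setOutputPath(conf, new Path(args[2]));
        JobClient.runJob(conf);
    }
}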

cheers





On Jul 15, 2010, at 10:26 PM, David Hawthorne wrote:


I have two previously created output files of format:

key[tab]value

where key is text, value is an integer sum of how many times the key  
appeared.


I would like to reduce these output files together into one new  
output file.  I'm having problems finding out how to do this.


I've found ways to specify a job with no reducers, but it doesn't look like there's a way to specify a reduce-only job, aside from using the streaming interface with 'cat' as the mapper. I'm not opposed to this, but I also couldn't find a way to specify 'cat' as the mapper and the reducer in my Java class as the reducer. I'm also not sure this would work, as the reducer might simply see the entire line emitted by cat as the key. I could use awk as the reducer, but I've heard that streaming is less performant than Java, and I've already got the Java class written. I could write another Java class with a mapper that splits the value on tab and emits the two fields as <key, value>, but that seems like it would be extra work and less optimal than being able to run a reduce-only job.


So... what are the options?  Is there a way to specify a reduce-only  
job?

Error compiling mapreduce

2010-06-16 Thread Asif Jan


Hi

I am getting this strange error in the target "compile-mapred-classes" when compiling map-reduce on Mac OS X. JAVA_HOME is properly set, and if I remove the <jsp-compile> task from compile-mapred-classes everything works fine. Is anything special needed in order to get Jasper to work?


thanks

[jsp-compile] java.lang.IllegalStateException: No Java compiler available
[jsp-compile]   at org.apache.jasper.JspCompilationContext.createCompiler(JspCompilationContext.java:224)
[jsp-compile]   at org.apache.jasper.JspC.processFile(JspC.java:946)
[jsp-compile]   at org.apache.jasper.JspC.execute(JspC.java:1094)
[jsp-compile]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[jsp-compile]   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[jsp-compile]   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[jsp-compile]   at java.lang.reflect.Method.invoke(Method.java:597)
[jsp-compile]   at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:105)
[jsp-compile]   at org.apache.tools.ant.TaskAdapter.execute(TaskAdapter.java:134)
[jsp-compile]   at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
[jsp-compile]   at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
[jsp-compile]   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[jsp-compile]   at java.lang.reflect.Method.invoke(Method.java:597)
[jsp-compile]   at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:105)
[jsp-compile]   at org.apache.tools.ant.Task.perform(Task.java:348)
[jsp-compile]   at org.apache.tools.ant.Target.execute(Target.java:357)
[jsp-compile]   at org.apache.tools.ant.Target.performTasks(Target.java:385)
[jsp-compile]   at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1329)
[jsp-compile]   at org.apache.tools.ant.Project.executeTarget(Project.java:1298)
[jsp-compile]   at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
[jsp-compile]   at org.eclipse.ant.internal.ui.antsupport.EclipseDefaultExecutor.executeTargets(EclipseDefaultExecutor.java:32)
[jsp-compile]   at org.apache.tools.ant.Project.executeTargets(Project.java:1181)
[jsp-compile]   at org.eclipse.ant.internal.ui.antsupport.InternalAntRunner.run(InternalAntRunner.java:423)
[jsp-compile]   at org.eclipse.ant.internal.ui.antsupport.InternalAntRunner.main(InternalAntRunner.java:137)

BUILD FAILED
/Users/asifjan/Documents/eclipse_workspaces/gaia/hadoop-mapreduce-trunk/build.xml:373: org.apache.jasper.JasperException: java.lang.IllegalStateException: No Java compiler available





exception related to logging (using latest sources)

2010-06-15 Thread Asif Jan


Hi

I am getting the following exception when running map-reduce jobs.


java.lang.NullPointerException
        at org.apache.hadoop.mapred.TaskLogAppender.flush(TaskLogAppender.java:69)
        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:222)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:219)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:813)
        at org.apache.hadoop.mapred.Child.main(Child.java:211)


I am using the latest sources (0.22.0-snapshot) that I built myself.

any ideas?

thanks




How to use MapFile in mapreduce

2010-06-15 Thread Asif Jan

Hi

Any pointers on how to use a MapFile with the new mapreduce API?

I did find the corresponding output format, e.g. org.apache.hadoop.mapreduce.lib.output.MapFileOutputFormat, but was not able to see how I can specify a MapFileInputFormat (naively I thought that org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat should work for a MapFile as well).

Will I have to implement a RecordReader in order to read from a MapFile?
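(For context, a minimal sketch of reading a MapFile directly with its own reader, outside map-reduce; the path and key/value types are made up:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileReadSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // A MapFile is a directory holding a sorted SequenceFile ("data")
        // plus an index; the Reader takes the directory path.
        MapFile.Reader reader = new MapFile.Reader(fs, "/example/mapfile", conf);
        Text key = new Text();
        IntWritable value = new IntWritable();
        while (reader.next(key, value)) {
            System.out.println(key + "\t" + value);
        }
        reader.close();
    }
}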

Thanks

Re: calling C programs from Hadoop

2010-05-29 Thread Asif Jan

Look at Hadoop Streaming; it may be helpful to you.
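A streaming run invoking a C binary might look something like this (the jar location and program names are only illustrative):

hadoop jar hadoop-streaming.jar \
  -input /user/you/input -output /user/you/output \
  -mapper ./my_c_mapper -reducer ./my_c_reducer \
  -file my_c_mapper -file my_c_reducer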

asif
On May 29, 2010, at 8:31 PM, Michael Robinson wrote:



I am new to Hadoop. I have successfully run Java programs from Hadoop and I would like to call C programs from Hadoop.

Thank you for your help

Michael



Asif Jan
Gaia Project
SixSq Sarl / ISDC Astrophysics Data Centre & Geneva Observatory 
Chemin des Ecogia 16

CH-1290 Versoix
Switzerland

E-mail  : asif@unige.ch
Tel.: +41 22 37 92198
Fax : +41 22 37 92133


How to make latest build work ?

2010-04-19 Thread Asif Jan


Hi

I need to build a Hadoop installation from the latest source code of hadoop/common; I checked out the latest source and ran the ant target that makes a distribution tar (ant tar).

When I try to run the system I get an error that HDFS is not found.

Any idea how I can get a functional system from the latest sources?

thanks