Hadoop Eclipse plugin doesn't show a dialog for New Hadoop Locations ... on MacOS

2011-07-07 Thread Teruhiko Kurosaka
Hi,
I'm new to Hadoop.  I'm trying to set up Eclipse for Hadoop debugging.
I have:
Eclipse 3.5.2, configured to run apps with JRE 1.5.
MacOS 10.6.8
Hadoop 0.21.0

I copied mapred/contrib/eclipse-plugin/hadoop-0.21.0-eclipse-plugin.jar
to /Applications/eclipse/plugins
and restarted Eclipse.
I switched to the Map/Reduce perspective and I see the Map/Reduce Locations tab
next to the Problems, Tasks and Javadoc tabs.
I switched to the Map/Reduce Locations tab, right-clicked within
the pane and chose "New Hadoop location...", but nothing happens.
A dialog window is supposed to pop up, but nothing appears.

Are there any known issues? How do I trace this problem?


T. "Kuro" Kurosaka




Re: Hadoop Eclipse plugin doesn't show a dialog for New Hadoop Locations ... on MacOS

2011-07-07 Thread Teruhiko Kurosaka
Thanks, but selection of JRE 1.6 didn't help.


On 7/7/11 8:54 PM, "Pandu Pradhana"  wrote:

>Hi, 
>
>Maybe it is related to the Java version you are using. Try using Java 1.6.
>
>Regards,
>--Pandu
>



Re: Hadoop Eclipse plugin doesn't show a dialog for New Hadoop Locations ... on MacOS

2011-07-07 Thread Teruhiko Kurosaka
I've found an Exception in the Eclipse log.

Eclipse complains that it can't find org.apache.hadoop.conf.Configuration.
But I see that the class is in lib/hadoop-common.jar inside the plug-in jar,
mapred/contrib/eclipse-plugin/hadoop-0.21.0-eclipse-plugin.jar.

A closer look shows that META-INF/MANIFEST.MF has a wrong entry:
Bundle-ClassPath: classes/,lib/hadoop-core.jar

Notice that lib/hadoop-core.jar is mentioned instead of
lib/hadoop-common.jar.
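
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of the plug-in or of Hadoop: it reads the Bundle-ClassPath header from a jar's META-INF/MANIFEST.MF and reports entries that do not exist inside the archive. The function names parse_manifest_header and missing_classpath_entries are made up for this illustration, and the sketch assumes a plain comma-separated Bundle-ClassPath without extra parameters.

```python
import zipfile


def parse_manifest_header(manifest_text, name):
    """Return the value of one manifest header, or None if it is absent.

    In the JAR manifest format, a line starting with a single space
    continues the previous line, so those lines are joined here.
    """
    value = None
    for line in manifest_text.splitlines():
        if value is not None:
            if line.startswith(" "):
                value += line[1:]
                continue
            break
        if line.lower().startswith(name.lower() + ":"):
            value = line.split(":", 1)[1].strip()
    return value


def missing_classpath_entries(jar_path):
    """List Bundle-ClassPath entries that are not present in the jar."""
    with zipfile.ZipFile(jar_path) as jar:
        names = set(jar.namelist())
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    header = parse_manifest_header(manifest, "Bundle-ClassPath") or "."
    missing = []
    for entry in (e.strip() for e in header.split(",")):
        if entry in ("", "."):
            continue  # "." means the bundle root itself
        if entry.endswith("/"):
            # Directory entry: present if any archive member lives under it.
            if not any(n.startswith(entry) for n in names):
                missing.append(entry)
        elif entry not in names:
            missing.append(entry)
    return missing
```

Calling missing_classpath_entries() on the plug-in jar would be expected to list lib/hadoop-core.jar if that file is indeed absent from the archive, which is what the manifest entry above suggests.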




Unhandled event loop exception

java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
    at org.apache.hadoop.eclipse.server.HadoopServer.<init>(HadoopServer.java:223)
    at org.apache.hadoop.eclipse.servers.HadoopLocationWizard.<init>(HadoopLocationWizard.java:88)
    at org.apache.hadoop.eclipse.actions.NewLocationAction$1.<init>(NewLocationAction.java:41)
    at org.apache.hadoop.eclipse.actions.NewLocationAction.run(NewLocationAction.java:40)
    at org.eclipse.jface.action.Action.runWithEvent(Action.java:498)
    at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:584)
    at org.eclipse.jface.action.ActionContributionItem.access$2(ActionContributionItem.java:501)
    at org.eclipse.jface.action.ActionContributionItem$6.handleEvent(ActionContributionItem.java:452)
    at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:84)
    at org.eclipse.swt.widgets.Display.sendEvent(Display.java:3543)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1250)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1273)
    at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1258)
    at org.eclipse.swt.widgets.Widget.notifyListeners(Widget.java:1079)
    at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:3441)
    at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3100)
    at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2405)
    at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2369)
    at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2221)
    at org.eclipse.ui.internal.Workbench$5.run(Workbench.java:500)
    at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:332)
    at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:493)
    at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
    at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:113)
    at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:194)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
    at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:368)
    at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:559)
    at org.eclipse.equinox.launcher.Main.basicRun(Main.java:514)
    at org.eclipse.equinox.launcher.Main.run(Main.java:1311)








Which release to use?

2011-07-14 Thread Teruhiko Kurosaka
I'm a newbie and I am confused by the Hadoop releases.
I thought 0.21.0 was the latest and greatest release that I
should be using, but I noticed that 0.20.203 was released
recently, and 0.21.X is marked "unstable, unsupported".

Should I be using 0.20.203?

T. "Kuro" Kurosaka




How big data and/or how many machines do I need to take advantage of Hadoop?

2011-08-31 Thread Teruhiko Kurosaka
Hadoop newbie here.

I wrapped my company's entity extraction product in a Hadoop task
and gave it a large file, on the order of 100 MB.
I have 4 VMs running on a 24-core CPU server; two of them are
slave nodes, one is the namenode and one is the job tracker.
It turned out that processing the same amount of data takes longer
with Hadoop than processing it serially.

I am curious how I can see the advantage of
Hadoop.  Is having many physical machines essential?
Would I need to process terabytes of data? What would be
the minimum setup where I can see the advantage
of Hadoop?

T. "Kuro" Kurosaka



Re: How big data and/or how many machines do I need to take advantage of Hadoop?

2011-08-31 Thread Teruhiko Kurosaka
Brian,
This particular task spends its time in computation, on the order of minutes.

T. "Kuro" Kurosaka

From: Brian Bockelman <bbock...@cse.unl.edu>
Reply-To: common-user@hadoop.apache.org
Date: Wed, 31 Aug 2011 08:04:07 -0400
To: common-user@hadoop.apache.org
Subject: Re: How big data and/or how many machines do I need to take advantage of Hadoop?

Hi Kuro,

A 100MB file should take about 1 second to read, and MR jobs typically take on
the order of seconds just to get scheduled.  So, it's unlikely you'll see any benefit.

You'll probably want to have a look at Amdahl's law:

http://en.wikipedia.org/wiki/Amdahl%27s_law

Brian
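
As a rough illustration of the point about Amdahl's law and scheduling overhead, here is a small sketch. It combines the standard Amdahl speedup formula with a fixed per-job overhead. The function names and every number in the usage note (parallel fraction, overhead, serial run time) are assumptions chosen for illustration, not measurements from this thread.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def estimated_job_seconds(serial_seconds, parallel_fraction, workers, overhead_seconds):
    """Rough wall-clock estimate: fixed job overhead plus Amdahl-scaled work.

    overhead_seconds stands in for job submission, scheduling and task
    startup; it is a hypothetical lump sum, not a value Hadoop reports.
    """
    scaled_work = serial_seconds / amdahl_speedup(parallel_fraction, workers)
    return overhead_seconds + scaled_work
```

With assumed values of 180 seconds of serial work, 90% of it parallelizable, 2 worker nodes and 30 seconds of overhead, estimated_job_seconds(180, 0.9, 2, 30) comes out at about 129 seconds, a modest gain. With a shorter serial run or a larger overhead, the estimate can exceed the serial time, which is the effect described in the original question.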
