Re: Hadoop and Matlab

2009-04-23 Thread nitesh bhatia
Hi
The simplest way for you to run Matlab would be to use the distributed
computing toolbox that Matlab provides. You just need to configure Matlab
to discover the other Matlab machines. That way you will not need to set up
a Hadoop cluster. However, if you want to use Hadoop as a backend framework
for distributed processing, I would suggest you go for Octave, which is an
open-source toolkit much like Matlab. It provides interfaces for C/C++. I
think it would be easier to configure with Hadoop than Matlab, which is not
open source and requires a license.

--nitesh


On Wed, Apr 22, 2009 at 7:10 AM, Edward J. Yoon wrote:

> Hi,
> Where will you store the images? How will you retrieve them?
>
> If you have metadata for the images, the map task can receive the
> 'filename' of an image as a key, and its file properties (host, file
> path, etc.) as the value. Then, I guess you can handle the Matlab
> process using a Runtime object on the Hadoop cluster.
>
> On Wed, Apr 22, 2009 at 9:30 AM, Sameer Tilak wrote:
> > Hi Edward,
> > Yes, we're building this to handle hundreds of thousands of images (at
> > least). We're thinking the processing of individual images (or a set of
> > images together) will be done in Matlab itself. However, we can use the
> > Hadoop framework to process the data in parallel: one Matlab instance
> > handling a few hundred images (as a mapper), hundreds of such
> > instances, and then a reducer combining the output of each instance.
> >
> > On Tue, Apr 21, 2009 at 5:06 PM, Edward J. Yoon wrote:
> >
> >> Hi, what is the input data?
> >>
> >> As I understand it, you have a lot of images and want to process them
> >> all with your Matlab script. In that case, you will have to write
> >> some code yourself. I did a similar thing for plotting graphs with
> >> gnuplot. However, if you want to do large-scale linear algebra
> >> operations for large image processing, I would recommend investigating
> >> other solutions. Hadoop is not general-purpose clustering software,
> >> and it cannot run Matlab.
> >>
> >> On Wed, Apr 22, 2009 at 2:55 AM, Sameer Tilak wrote:
> >> > Hi there,
> >> >
> >> > We're working on an image analysis project. The image processing code
> >> > is written in Matlab. If I invoke that code from a shell script and
> >> > then use that shell script within Hadoop streaming, will that work?
> >> > Has anyone done something along these lines?
> >> >
> >> > Many thanks,
> >> > --ST.
> >> >
> >>
> >>
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> edwardy...@apache.org
> >> http://blog.udanax.org
> >>
> >
>
>
>
> --
> Best Regards, Edward J. Yoon
> edwardy...@apache.org
> http://blog.udanax.org
>



-- 
Nitesh Bhatia
Dhirubhai Ambani Institute of Information & Communication Technology
Gandhinagar
Gujarat

"Life is never perfect. It just depends where you draw the line."


Re: Hadoop and Matlab

2009-04-21 Thread Edward J. Yoon
Hi,
Where will you store the images? How will you retrieve them?

If you have metadata for the images, the map task can receive the
'filename' of an image as a key, and its file properties (host, file
path, etc.) as the value. Then, I guess you can handle the Matlab
process using a Runtime object on the Hadoop cluster.
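
A minimal sketch of that idea, written as a Hadoop streaming mapper in Python rather than with the Java Runtime object mentioned above. It assumes each input line is 'filename<TAB>host,path'; that record layout, the names parse_record and run_mapper, and the command that is launched are all hypothetical, not something this thread specifies.

```python
import subprocess
import sys


def parse_record(line):
    """Split one 'filename<TAB>host,path' line into (filename, host, path)."""
    filename, props = line.rstrip("\n").split("\t", 1)
    host, path = props.split(",", 1)
    return filename, host, path


def run_mapper(lines, command, out=sys.stdout):
    """Run `command <path>` for each record and emit 'filename<TAB>exit status'.

    `command` is a list such as ["./process_image"] (a hypothetical launcher
    for the Matlab code). The child's stdout is discarded so it cannot mix
    with the mapper's own key/value output.
    """
    for line in lines:
        if not line.strip():
            continue  # skip blank input lines
        filename, _host, path = parse_record(line)
        result = subprocess.run(command + [path], stdout=subprocess.DEVNULL)
        out.write("%s\t%d\n" % (filename, result.returncode))
```

As a streaming mapper, the script would end with a call such as run_mapper(sys.stdin, ["./process_image"]). Emitting the exit status per file is only one choice; it makes failed images easy to count in the reducer.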

On Wed, Apr 22, 2009 at 9:30 AM, Sameer Tilak wrote:
> Hi Edward,
> Yes, we're building this to handle hundreds of thousands of images (at
> least). We're thinking the processing of individual images (or a set of
> images together) will be done in Matlab itself. However, we can use the
> Hadoop framework to process the data in parallel: one Matlab instance
> handling a few hundred images (as a mapper), hundreds of such
> instances, and then a reducer combining the output of each instance.
>
> On Tue, Apr 21, 2009 at 5:06 PM, Edward J. Yoon wrote:
>
>> Hi, what is the input data?
>>
>> As I understand it, you have a lot of images and want to process them
>> all with your Matlab script. In that case, you will have to write
>> some code yourself. I did a similar thing for plotting graphs with
>> gnuplot. However, if you want to do large-scale linear algebra
>> operations for large image processing, I would recommend investigating
>> other solutions. Hadoop is not general-purpose clustering software,
>> and it cannot run Matlab.
>>
>> On Wed, Apr 22, 2009 at 2:55 AM, Sameer Tilak wrote:
>> > Hi there,
>> >
>> > We're working on an image analysis project. The image processing code
>> > is written in Matlab. If I invoke that code from a shell script and
>> > then use that shell script within Hadoop streaming, will that work?
>> > Has anyone done something along these lines?
>> >
>> > Many thanks,
>> > --ST.
>> >
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> edwardy...@apache.org
>> http://blog.udanax.org
>>
>



-- 
Best Regards, Edward J. Yoon
edwardy...@apache.org
http://blog.udanax.org


Re: Hadoop and Matlab

2009-04-21 Thread Sameer Tilak
Hi Edward,
Yes, we're building this for handling hundreds of thousands images (at
least). We're thinking processing of individual images (or a set of images
together) will be done in Matlab itself. However, we can use Hadoop
framework to process the data in parallel fashion. One Matlab instance
handling few hundred images (as a mapper) and have hundreds of such
instances and then combine (reducer) the o/p of each instance.
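
A rough Python sketch of the two halves of that plan: splitting the image list into batches of a few hundred per Matlab instance, and a reduce step over 'key<TAB>value' lines. What the reducer actually combines is not stated here, so summing a numeric value per key is purely an assumption, and the names batches and combine are hypothetical.

```python
from itertools import islice


def batches(filenames, size):
    """Yield successive lists of at most `size` filenames."""
    it = iter(filenames)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def combine(lines):
    """Reducer step: sum the numeric values emitted for each key.

    Each line is 'key<TAB>value'. Summation is an assumed way of
    combining per-instance output, chosen only for illustration.
    """
    totals = {}
    for line in lines:
        key, value = line.rstrip("\n").split("\t", 1)
        totals[key] = totals.get(key, 0.0) + float(value)
    return totals
```

Batching matters because starting Matlab is slow; one start-up per few hundred images spreads that cost, at the price of redoing a whole batch if its task fails.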

On Tue, Apr 21, 2009 at 5:06 PM, Edward J. Yoon wrote:

> Hi, what is the input data?
>
> As I understand it, you have a lot of images and want to process them
> all with your Matlab script. In that case, you will have to write
> some code yourself. I did a similar thing for plotting graphs with
> gnuplot. However, if you want to do large-scale linear algebra
> operations for large image processing, I would recommend investigating
> other solutions. Hadoop is not general-purpose clustering software,
> and it cannot run Matlab.
>
> On Wed, Apr 22, 2009 at 2:55 AM, Sameer Tilak wrote:
> > Hi there,
> >
> > We're working on an image analysis project. The image processing code
> > is written in Matlab. If I invoke that code from a shell script and
> > then use that shell script within Hadoop streaming, will that work?
> > Has anyone done something along these lines?
> >
> > Many thanks,
> > --ST.
> >
>
>
>
> --
> Best Regards, Edward J. Yoon
> edwardy...@apache.org
> http://blog.udanax.org
>


Re: Hadoop and Matlab

2009-04-21 Thread Edward J. Yoon
Hi, what is the input data?

As I understand it, you have a lot of images and want to process them
all with your Matlab script. In that case, you will have to write
some code yourself. I did a similar thing for plotting graphs with
gnuplot. However, if you want to do large-scale linear algebra
operations for large image processing, I would recommend investigating
other solutions. Hadoop is not general-purpose clustering software,
and it cannot run Matlab.

On Wed, Apr 22, 2009 at 2:55 AM, Sameer Tilak wrote:
> Hi there,
>
> We're working on an image analysis project. The image processing code is
> written in Matlab. If I invoke that code from a shell script and then use
> that shell script within Hadoop streaming, will that work? Has anyone done
> something along these lines?
>
> Many thanks,
> --ST.
>



-- 
Best Regards, Edward J. Yoon
edwardy...@apache.org
http://blog.udanax.org


Re: Hadoop and Matlab

2009-04-21 Thread Peter Skomoroch
If you can compile the Matlab code to an executable with the Matlab
compiler and send it to the nodes with the distributed cache, that
should work. You probably want to avoid licensing fees for running
copies of Matlab itself on the cluster.
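
A sketch of how such a job might be put together, as a small Python helper that only builds the streaming command line. The jar location, HDFS paths and script names are placeholders, and the option names (-input, -output, -mapper, -file, -cacheFile) are the streaming options as I remember them; check them against the streaming documentation for your Hadoop version.

```python
def streaming_command(streaming_jar, input_dir, output_dir,
                      mapper_script, cached_executable_uri, link_name):
    """Build the argument list for a Hadoop streaming job.

    All arguments are caller-supplied placeholders; nothing here is
    specific to a particular cluster layout.
    """
    return [
        "hadoop", "jar", streaming_jar,
        "-input", input_dir,
        "-output", output_dir,
        # The mapper is a small wrapper that calls ./<link_name> per record.
        "-mapper", mapper_script,
        # Ship the wrapper script itself along with the job.
        "-file", mapper_script,
        # Distributed cache: 'URI#link' asks for a symlink called link_name
        # in each task's working directory.
        "-cacheFile", "%s#%s" % (cached_executable_uri, link_name),
    ]
```

The returned list can be passed to subprocess.call. Note that an executable built with the Matlab compiler still needs the Matlab Compiler Runtime installed on every node.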



On Apr 21, 2009, at 1:55 PM, Sameer Tilak wrote:

> Hi there,
>
> We're working on an image analysis project. The image processing code is
> written in Matlab. If I invoke that code from a shell script and then use
> that shell script within Hadoop streaming, will that work? Has anyone done
> something along these lines?
>
> Many thanks,
> --ST.


RE: Hadoop and Matlab

2009-04-21 Thread Patterson, Josh
Sameer,
I'd also be interested in that; we are constructing a Hadoop cluster
for energy data (PMU) for the NERC, and we will potentially be running
jobs for a number of groups and researchers. I know some researchers
will know nothing of MapReduce, yet are very keen on Matlab, so we're
looking at ways to make that transition as smooth as possible.

Josh Patterson
TVA

-----Original Message-----
From: Sameer Tilak [mailto:sameer.u...@gmail.com] 
Sent: Tuesday, April 21, 2009 1:56 PM
To: core-user@hadoop.apache.org
Subject: Hadoop and Matlab

Hi there,

We're working on an image analysis project. The image processing code is
written in Matlab. If I invoke that code from a shell script and then use
that shell script within Hadoop streaming, will that work? Has anyone done
something along these lines?

Many thanks,
--ST.


Re: Hadoop and Matlab

2008-12-12 Thread Edward J. Yoon
Just FYI, see Hama (http://incubator.apache.org/hama/).

We are working on a parallel math project using Hadoop (we got a positive
answer from the ScaLAPACK/Matlab people).

On Sat, Dec 13, 2008 at 12:39 PM, Dmitry Pushkarev wrote:
> Hi.
>
> Can anyone share experience of successfully parallelizing Matlab tasks
> using Hadoop?
>
> We have implemented this with Python (in the form of a simple module that
> takes a serialized function and a data array and runs the function on the
> cluster), but we really have no clue how to do that in Matlab.
>
> Ideally we want to use Matlab in the same way: write a .m file that takes
> a set of parameters and returns some value, specify a list of input
> parameters (like lists of variables to try for Gaussian kernels), and run
> it on the cluster in a somewhat failproof manner.
>
> Has anyone tried that?
>
> ---
> Dmitry
>
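
For the parameter-list part of the question above, one possible shape (only a sketch, with a hypothetical helper name and an assumed 'name=value' line format) is to expand the lists into one input line per combination, so that each line can drive a single run of the .m function in a map task:

```python
from itertools import product


def parameter_lines(grid):
    """Yield one tab-separated line per combination in `grid`.

    `grid` maps a parameter name to the list of values to try. Names are
    sorted so the column order is stable from run to run.
    """
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield "\t".join("%s=%s" % (n, v) for n, v in zip(names, values))
```

For example, list(parameter_lines({"sigma": [0.5, 1.0], "c": [1, 10]})) gives four lines, starting with 'c=1<TAB>sigma=0.5'. Keeping each combination on its own line lets a failed task be retried without affecting the others.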



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardy...@apache.org
http://blog.udanax.org