Encoding image into Jpeg2000 using Hadoop

2010-11-28 Thread PORTO aLET
Just wondering if anybody has encoded/compressed large images into
JPEG2000 format using Hadoop?

We have 1TB+ raw images that need to be compressed into JPEG2000 and other
formats. On one beefy machine the compression rate is about 2GB/hour,
so it takes more than 500 hours to compress one image.

There is also http://code.google.com/p/matsu-project/ which uses MapReduce
to process imagery (
http://www.cloudera.com/videos/hw10_video_hadoop_image_processing_for_disaster_relief)
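
A common way to parallelize this kind of job is to cut the raster into
tiles, encode each tile in its own map task, and assemble the results
afterwards. A rough sketch of the partitioning arithmetic only (the tile
size and worker count are made-up assumptions, not from this thread; real
JPEG2000 tiling is done by the encoder, not by byte arithmetic):

```python
GB = 1024 ** 3

def plan_tiling(image_bytes, tile_bytes, rate_bytes_per_hour, workers):
    """Estimate tile count and wall-clock hours for tile-parallel encoding.

    Assumes every worker encodes at the same rate and tiles are independent;
    ignores shuffle/stitching overhead.
    """
    tiles = -(-image_bytes // tile_bytes)              # ceiling division
    serial_hours = image_bytes / rate_bytes_per_hour   # one-machine baseline
    waves = -(-tiles // workers)                       # batches of parallel tasks
    parallel_hours = waves * (tile_bytes / rate_bytes_per_hour)
    return tiles, serial_hours, parallel_hours

tiles, serial_hours, parallel_hours = plan_tiling(
    image_bytes=1024 * GB,       # ~1 TB image, as in the post
    tile_bytes=2 * GB,           # hypothetical tile size
    rate_bytes_per_hour=2 * GB,  # ~2 GB/hour per node, as in the post
    workers=64,                  # hypothetical cluster size
)
print(tiles, serial_hours, parallel_hours)  # 512 512.0 8.0
```

With these assumed numbers, 64 nodes bring the ~500-hour serial job down
to roughly 8 hours of encoding time.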


Re: delay the execution of reducers

2010-11-28 Thread li ping
org.apache.hadoop.mapred.JobInProgress

Maybe this class will help you.
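
For reference, this setting normally lives in mapred-site.xml; on 0.20.x
the property name is mapred.reduce.slowstart.completed.maps (a sketch;
the description text is my own wording):

```xml
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>1.0</value>
  <description>Fraction of map tasks that must complete before reduce
  tasks are scheduled; 1.0 holds reducers until every map has finished.
  </description>
</property>
```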

On Mon, Nov 29, 2010 at 4:36 AM, Da Zheng  wrote:

> I have a problem with subscribing mapreduce mailing list.
>
> I use hadoop-0.20.2. I have added this parameter to mapred-site.xml. Is
> there any way for me to check whether the parameter has been read and
> activated?
>
> BTW, what do you mean by opening a jira?
>
> Thanks,
> Da
>
>
> On 11/28/2010 05:03 AM, Arun C Murthy wrote:
>
>> Moving to mapreduce-user@, bcc common-u...@. Please use project
>> specific lists.
>>
>> mapreduce.reduce.slowstart.completed.maps is the right knob. Which version
>> of hadoop are you running? If it isn't working, please open a jira. Thanks.
>>
>> Arun
>>
>> On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:
>>
>>  Hello,
>>>
>>> I found that in Hadoop reducers start when a fraction of the mappers is
>>> complete. However, in my case, I would like reducers to start only when
>>> all mappers are complete. I searched the Hadoop configuration parameters
>>> and found mapred.reduce.slowstart.completed.maps, which seems to do what
>>> I want. But no matter what value (0.99, 1.00, etc.) I set for
>>> mapred.reduce.slowstart.completed.maps, reducers always start to execute
>>> when about 10% of the mappers are complete.
>>>
>>> Do I set the right parameter? Is there any other parameter I can use for
>>> this purpose?
>>>
>>> Thanks,
>>> Da
>>>
>>
>>
>


-- 
-李平


Re: delay the execution of reducers

2010-11-28 Thread Da Zheng

I have a problem with subscribing mapreduce mailing list.

I use hadoop-0.20.2. I have added this parameter to mapred-site.xml. Is 
there any way for me to check whether the parameter has been read and 
activated?


BTW, what do you mean by opening a jira?

Thanks,
Da
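
One way to sanity-check the configuration is to parse the file directly.
This is only a sketch: it checks mapred-site.xml on disk, not what the
JobTracker actually loaded (the job.xml / "Job File" link on a job's page
in the JobTracker web UI shows the effective values for a submitted job):

```python
import xml.etree.ElementTree as ET

def get_conf_value(path, name):
    """Return the <value> text for the named property in a Hadoop
    *-site.xml file, or None if the property is absent."""
    for prop in ET.parse(path).getroot().iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# e.g. get_conf_value("conf/mapred-site.xml",
#                     "mapred.reduce.slowstart.completed.maps")
```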

On 11/28/2010 05:03 AM, Arun C Murthy wrote:

Moving to mapreduce-user@, bcc common-u...@. Please use project
specific lists.

mapreduce.reduce.slowstart.completed.maps is the right knob. Which version
of hadoop are you running? If it isn't working, please open a jira. Thanks.


Arun

On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:


Hello,

I found that in Hadoop reducers start when a fraction of the mappers is
complete. However, in my case, I would like reducers to start only when all
mappers are complete. I searched the Hadoop configuration parameters and
found mapred.reduce.slowstart.completed.maps, which seems to do what I want.
But no matter what value (0.99, 1.00, etc.) I set for
mapred.reduce.slowstart.completed.maps, reducers always start to execute
when about 10% of the mappers are complete.

Do I set the right parameter? Is there any other parameter I can use for
this purpose?

Thanks,
Da






Re: InputSplit is confusing me .. Any clarifications ??

2010-11-28 Thread Steve Lewis
Override TextInputFormat so that

  protected boolean isSplitable(FileSystem fs, Path filename)

returns false. This forces the entire file to be read as a single split.
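
As a toy illustration (plain Python, not the Hadoop API) of what turning
off splitability does: a splitable file is chopped into block-sized splits,
one mapper per split, while a non-splitable file becomes a single split:

```python
def get_splits(file_size, block_size, splitable):
    """Toy model of FileInputFormat.getSplits(): returns (offset, length)
    pairs. Real Hadoop also applies min/max split sizes and a slack factor."""
    if not splitable:
        return [(0, file_size)]           # one mapper reads the whole file
    splits, offset = [], 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits

print(get_splits(200, 64, True))   # four splits -> four mappers
print(get_splits(200, 64, False))  # one split -> one mapper
```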


On Sat, Nov 27, 2010 at 9:27 PM, Arun C Murthy  wrote:

> Moving to mapreduce-user@, bcc common-u...@. Please use project specific
> lists.
>
> Your InputSplits are defined by your InputFormat. Take a look 'getSplits'
> method in InputFormat.java.
>
> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Job+Input
>
> http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#InputSplit
>
> Arun
>
>
> On Nov 27, 2010, at 3:42 PM, maha wrote:
>
>  Sorry, I mistyped LineRecordReader as LineInputSplit ... So here is my
>> question again ..
>>
>>
>>> Thanks for the reply .. although I read it has to do with "InputSplit",
>>> which represents the data to be processed by an individual Mapper. By
>>> default it's a LineRecordReader.
>>>
>>> How can I change this property to FileInputSplit so that the record is
>>> the whole file?
>>>
>>> something like JobConf.set("File.input.format", "FileInputSplit");
>>>
>>> Is there such a way?
>>>
>>> Thanks in advance,
>>>  Maha
>>>
>>>
>>
>


-- 
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA


Re: delay the execution of reducers

2010-11-28 Thread Arun C Murthy

Moving to mapreduce-user@, bcc common-u...@. Please use project
specific lists.

mapreduce.reduce.slowstart.completed.maps is the right knob. Which version
of hadoop are you running? If it isn't working, please open a jira. Thanks.


Arun

On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:


Hello,

I found that in Hadoop reducers start when a fraction of the mappers is
complete. However, in my case, I would like reducers to start only when all
mappers are complete. I searched the Hadoop configuration parameters and
found mapred.reduce.slowstart.completed.maps, which seems to do what I want.
But no matter what value (0.99, 1.00, etc.) I set for
mapred.reduce.slowstart.completed.maps, reducers always start to execute
when about 10% of the mappers are complete.

Do I set the right parameter? Is there any other parameter I can use for
this purpose?

Thanks,
Da