I would say yes, make this a Jira.
The actual change can go (as proposed by Jay) in one of two directions: put
synchronization into all implementations OR take it out of all implementations.

I think the first thing to determine is why the synchronization was put
into the LineRecordWriter in the first place.

https://github.com/apache/hadoop-common/blame/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/TextOutputFormat.java

The oldest I have been able to find is a commit on 2009-05-18 for
HADOOP-4687 that is about moving stuff around (i.e. this code is even older
than that).
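
To make the question concrete, here is a minimal sketch of the situation being
discussed. It is NOT the Hadoop code: SimpleLineWriter, SharedWriterSketch and
writeConcurrently are hypothetical names I made up for illustration. The
assumption is that one record takes several calls on the underlying stream
(key, separator, value, newline), so when several threads share one writer,
the synchronized keyword on write is what keeps each record on its own line.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SharedWriterSketch {

    /** Hypothetical stand-in for a line-oriented record writer (not the Hadoop class). */
    static class SimpleLineWriter {
        private final OutputStream out;

        SimpleLineWriter(OutputStream out) {
            this.out = out;
        }

        // One record needs four stream writes; synchronized keeps them
        // together when several threads share this writer instance.
        synchronized void write(String key, String value) throws IOException {
            out.write(key.getBytes(StandardCharsets.UTF_8));
            out.write('\t');
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }
    }

    /** Starts the given number of threads on ONE shared writer and returns the lines written. */
    static List<String> writeConcurrently(int threads, int perThread) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        SimpleLineWriter writer = new SimpleLineWriter(sink);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                try {
                    for (int i = 0; i < perThread; i++) {
                        writer.write("k" + id, "v" + i);
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait until every thread has finished writing
        }
        String text = new String(sink.toByteArray(), StandardCharsets.UTF_8);
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n")) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = writeConcurrently(4, 1000);
        System.out.println(lines.size() + " lines written");
    }
}
```

If the writer is only ever called from a single thread, the synchronized
keyword in this sketch protects nothing, which is the other side of the
argument in this thread.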

Niels



On Thu, Aug 8, 2013 at 2:21 PM, Sathwik B P <sath...@apache.org> wrote:

> Hi Harsh,
>
> Do you want me to raise a Jira on this?
>
> regards,
> sathwik
>
>
> On Thu, Aug 8, 2013 at 5:23 PM, Jay Vyas <jayunit...@gmail.com> wrote:
>
>> Then is this a bug?  Synchronization in the absence of any race condition
>> is normally considered "bad".
>>
>> In any case I'd like to know why this writer is synchronized whereas the
>> other ones are not. That is, I think, the point at issue: either the other
>> writers should be synchronized or else this one shouldn't be. Consistency
>> across the writer implementations is probably desirable so that changes to
>> output formats or record writers don't lead to bugs in multithreaded
>> environments.
>>
>> On Aug 8, 2013, at 6:50 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>> While we don't fork threads by default, we do provide a MultithreadedMapper
>> implementation that would require such synchronization. But if you are
>> asking whether it is necessary, then perhaps the answer is no.
>> On Aug 8, 2013 3:43 PM, "Azuryy Yu" <azury...@gmail.com> wrote:
>>
>>> It's not about Hadoop-forked threads; we may create a line record writer
>>> ourselves and then call this writer concurrently.
>>> On Aug 8, 2013 4:00 PM, "Sathwik B P" <sathwik...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> Thanks for your reply.
>>>> May I know where Hadoop forks multiple threads to use a single
>>>> RecordWriter?
>>>>
>>>> regards,
>>>> sathwik
>>>>
>>>> On Thu, Aug 8, 2013 at 7:06 AM, Azuryy Yu <azury...@gmail.com> wrote:
>>>>
>>>>> Because we may use multiple threads to write a single file.
>>>>> On Aug 8, 2013 2:54 PM, "Sathwik B P" <sath...@apache.org> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> LineRecordWriter.write(..) is synchronized. I did not find any other
>>>>>> RecordWriter implementations that define write as synchronized.
>>>>>> Is there any specific reason for this?
>>>>>>
>>>>>> regards,
>>>>>> sathwik
>>>>>>
>>>>>
>>>>
>


-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
