The changes are in both 1.3 RC5 and the 1.4 trunk.
On 11/29/2012 01:26 PM, Mohit Anchlia wrote:
If I grab the last snapshot would I get these changes?
On Tue, Nov 20, 2012 at 3:24 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
that's awesome!
On Tue, Nov 20, 2012 at 3:11 PM, Mike Percy <mpe...@apache.org> wrote:
Mohit,
No problem, but Juhani did all the work. :)
The behavior is that you can configure an HDFS sink to close a
file if it hasn't gotten any writes in some time. After it's
been idle for 5 minutes or something, it gets closed. If you
get a "late" event that goes to the same path after the file
is closed, it will just create a new file in the same path as
usual.
Regards,
Mike
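
As a rough sketch of the idle-close behavior described above, an HDFS sink
configuration might look like the following. The agent, sink, and channel
names (agent1, hdfsSink, memCh) and the path are made up for illustration,
and this assumes the property added by FLUME-1660 is hdfs.idleTimeout,
measured in seconds; check the user guide for the release you are running.

```properties
# Hypothetical names: agent1, hdfsSink, memCh. Adjust to your own agent.
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.channel = memCh
# Hourly bucketed path; the time escapes need a timestamp header on each event.
agent1.sinks.hdfsSink.hdfs.path = /flume/events/%y/%m/%d/%H
# Assumed property from FLUME-1660: close a file after 300 s (5 min) without writes.
agent1.sinks.hdfsSink.hdfs.idleTimeout = 300
# Time-based roll in seconds, as a fallback (and the option available on 1.2.0).
agent1.sinks.hdfsSink.hdfs.rollInterval = 3600
```

With settings along these lines, a late event for an hourly path whose file
has already been closed would simply open a new file in that same path.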
On Tue, Nov 20, 2012 at 12:56 PM, Brock Noland <br...@cloudera.com> wrote:
We are currently voting on a 1.3.0 RC on the dev@ list:
http://s.apache.org/OQ0W
You don't have to be a committer to vote! :)
Brock
On Tue, Nov 20, 2012 at 2:53 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
> Thanks a lot!! Now with this, what should be the expected behaviour? After the
> file is closed, is a new file created for writes that come after closing the
> file?
>
> Thanks again for committing this change. Do you know when 1.3.0 will be out?
> I am currently using the snapshot version of 1.3.0.
>
> On Tue, Nov 20, 2012 at 11:16 AM, Mike Percy <mpe...@apache.org> wrote:
>>
>> Mohit,
>> FLUME-1660 is now committed and it will be in 1.3.0. In the case where you
>> are using 1.2.0, I suggest running with hdfs.rollInterval set so the files
>> will roll normally.
>>
>> Regards,
>> Mike
>>
>>
>> On Thu, Nov 15, 2012 at 11:23 PM, Juhani Connolly
>> <juhani_conno...@cyberagent.co.jp> wrote:
>>>
>>> I am actually working on a patch for exactly this, refer to FLUME-1660
>>>
>>> The patch is on review board right now, I fixed a corner case issue that
>>> came up with unit testing, but the implementation is not really to my
>>> satisfaction. If you are interested please have a look and add your opinion.
>>>
>>> https://issues.apache.org/jira/browse/FLUME-1660
>>> https://reviews.apache.org/r/7659/
>>>
>>>
>>> On 11/16/2012 01:16 PM, Mohit Anchlia wrote:
>>>
>>> Another question I had was about rollover. What's the best way to
>>> roll over files in a reasonable timeframe? For instance, our path is
>>> YY/MM/DD/HH, so every hour there is a new file, and the previous hour's
>>> file just sits there as .tmp; it sometimes takes even an hour before
>>> the .tmp is closed and renamed to .snappy. In this situation, is there a
>>> way to tell Flume to roll over files sooner based on some idle time limit?
>>>
>>> On Thu, Nov 15, 2012 at 8:14 PM, Mohit Anchlia <mohitanch...@gmail.com>
>>> wrote:
>>>>
>>>> Thanks Mike, it makes sense. Any way I can help?
>>>>
>>>>
>>>> On Thu, Nov 15, 2012 at 11:54 AM, Mike Percy <mpe...@apache.org> wrote:
>>>>>
>>>>> Hi Mohit, this is a complicated issue. I've filed
>>>>> https://issues.apache.org/jira/browse/FLUME-1714 to track it.
>>>>>
>>>>> In short, it would require a non-trivial amount of work to implement
>>>>> this, and it would need to be done carefully. I agree that it would be
>>>>> better if Flume handled this case more gracefully than it does today. Today,
>>>>> Flume assumes that you have some job that would go and clean up the .tmp
>>>>> files as needed, and that you understand that they could be partially
>>>>> written if a crash occurred.
>>>>>
>>>>> Regards,
>>>>> Mike
>>>>>
>>>>>
>>>>> On Sun, Nov 11, 2012 at 8:32 AM, Mohit Anchlia <mohitanch...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> What we are seeing is that if Flume gets killed, either because of
>>>>>> server failure or for other reasons, it leaves the .tmp file around.
>>>>>> Sometimes, for whatever reason, the .tmp file is not readable. Is there
>>>>>> a way to roll over the .tmp file more gracefully?
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
--
Apache MRUnit - Unit testing MapReduce -
http://incubator.apache.org/mrunit/