[
https://issues.apache.org/jira/browse/CHUKWA-369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739795#action_12739795
]
Jerome Boulon commented on CHUKWA-369:
--------------------------------------
Regarding the issue with .chukwa files, the new LocalWriter takes care of
this: any file older than the rotation period + 1 min will be renamed and sent
over to HDFS.
@Ari: there's one thing I don't understand. Since there's more than one client
writing to the same SeqFile, how do you know that the 2 additional MBs that you
are seeing on the file are coming from Client1 and not Client2? Also keep in
mind that in order to improve performance, most of the time you will have to
buffer data in memory first and then write it to disk in big chunks.
This is what HDFS does, and as far as I know there's no easy way to figure
out whether the data is still in memory or has been written to disk (at least
for now).
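A minimal sketch of the buffering point above (plain java.io, not Chukwa or HDFS code): bytes "written" by a client can sit in an in-memory buffer until the buffer fills or flush() is called, so the writer cannot assume they have reached the underlying store. The ByteArrayOutputStream here is just a stand-in for a file on disk.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Demonstrates that a buffered write does not immediately reach the sink.
public class BufferedWriteDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for disk
        BufferedOutputStream out = new BufferedOutputStream(sink, 8192);

        out.write(new byte[100]);        // 100 bytes "written" by the client...
        System.out.println(sink.size()); // ...but the sink has seen none of them yet

        out.flush();                     // force the buffer down to the sink
        System.out.println(sink.size()); // now all 100 bytes are there
    }
}
```

The same gap exists, in a harder-to-observe form, between an HDFS client's write() and the data actually being persisted on the datanodes.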
So unless you are able to keep track of the last SeqID per RecordType/Agent at
the collector side and then figure out what has been pushed to disk and what is
still in memory, I don't see a way to send the right information back to the
Agent.
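The bookkeeping described above could be sketched as follows. This is a hypothetical illustration, not Chukwa code: the class name, the streamKey (a RecordType/Agent pair), and both method names are made up for the example. The collector would record, per stream, the highest SeqID known to have been flushed to disk, so it can tell an agent which chunks are durable and which may still be sitting in memory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical collector-side bookkeeping: highest SeqID flushed to disk,
// keyed by a RecordType/Agent stream identifier.
public class CollectorAckTracker {
    private final Map<String, Long> lastFlushedSeqId = new HashMap<>();

    // Called after a successful flush of a stream's data to disk/HDFS.
    // merge(...) keeps the maximum, so out-of-order reports are harmless.
    public void recordFlushed(String streamKey, long seqId) {
        lastFlushedSeqId.merge(streamKey, seqId, Math::max);
    }

    // True if everything up to seqId for this stream is known to be on disk;
    // only then could the collector safely acknowledge it back to the agent.
    public boolean isDurable(String streamKey, long seqId) {
        return lastFlushedSeqId.getOrDefault(streamKey, -1L) >= seqId;
    }
}
```

The hard part, as noted above, is populating recordFlushed() correctly: without a reliable signal from the filesystem that a given byte range has been persisted, the tracker has nothing trustworthy to record.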
> proposed reliability mechanism
> ------------------------------
>
> Key: CHUKWA-369
> URL: https://issues.apache.org/jira/browse/CHUKWA-369
> Project: Hadoop Chukwa
> Issue Type: New Feature
> Components: data collection
> Affects Versions: 0.3.0
> Reporter: Ari Rabkin
> Fix For: 0.3.0
>
>
> We like to say that Chukwa is a system for reliable log collection. It isn't,
> quite, since we don't handle collector crashes. Here's a proposed
> reliability mechanism.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.