Re: 2 questions, the log file name and the log messy code

Ying Tang Mon, 22 Nov 2010 23:50:35 -0800

The garbled text ("messy code") was my mistake.
After switching to SequenceFileInputFormat, the file reads cleanly.
However, the metadata in the value is mixed in with my log content.
It would be better to add a \n after the metadata.
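
On newlines in text output (see Jerome's warning quoted below): a record that contains "\n", such as a stack trace, spans several lines in a line-oriented text file, so a line-based reader sees it as several records. The following is a minimal sketch of one possible workaround, escaping newlines so each record stays on one line. The class and method names (RecordLines, escape, unescape) are hypothetical, and this is plain Java, not Chukwa or Hadoop code.

```java
public class RecordLines {
    // Escape backslashes first, then newlines, so the mapping stays reversible.
    public static String escape(String record) {
        return record.replace("\\", "\\\\").replace("\n", "\\n");
    }

    // Reverse of escape(): "\\n" becomes a newline, "\\\\" becomes a backslash.
    public static String unescape(String line) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(++i);
                out.append(next == 'n' ? '\n' : next);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical log record containing a stack trace line.
        String record = "ERROR something failed\n\tat com.example.Foo.bar(Foo.java:42)";

        // Written as-is, a line-based reader would split this into two pieces.
        System.out.println("raw lines:     " + record.split("\n").length);

        // Escaped, the record occupies exactly one line and can be restored.
        String oneLine = escape(record);
        System.out.println("escaped lines: " + oneLine.split("\n").length);
        System.out.println("round trip ok: " + unescape(oneLine).equals(record));
    }
}
```

Keeping the data in sequence file format avoids the problem altogether, since record boundaries do not depend on newline characters there.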


On Sat, Nov 20, 2010 at 2:24 AM, Jerome Boulon <[email protected]> wrote:

> Just a warning: if you are using the Text output format, you will have a
> hard time with “\n” inside your logs, for example in stack traces.
> Also, a text file will be either non-compressed or non-splittable.
>
> /Jerome.
>
>
> On 11/19/10 9:30 AM, "Eric Yang" <[email protected]> wrote:
>
>
>
>
> On 11/19/10 12:37 AM, "Ying Tang" <[email protected]> wrote:
>
> Hi all,
>     1.   I have installed a 2-node Chukwa setup for testing: one agent and one
> collector. I also have HDFS, and I found that for the logs the collector
> writes to HDFS, the file name is
>           time+logsourcehost+java.rmi.server.UID()
>           The time format is yyyyddHHmmssSSS, so there is no month? And this
> is hard-coded.
>           I need the month, so must I change the code and recompile it?
>     2.   Another question: in the log file (in HDFS), the metadata is
> garbled ("messy code"), while the log content from the agent is fine.
>           My adaptor uses UTF-8; how can I solve this?
>
>
>
>    1. Looks like a mistake in the temp filename.  Please open a JIRA and
>    we will fix it.
>    2. The data is recorded in sequence file format to make the data easier
>    to process with MapReduce.  If you are expecting plain text of the log
>    content, you will need to write a map/reduce job that sets the output
>    format to text output format and channels the log file types accordingly.
>
>
> Regards,
> Eric
>
>
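
A note on the file name question quoted above: the sketch below shows why a pattern of yyyyddHHmmssSSS drops the month, using java.text.SimpleDateFormat. That Chukwa builds the name with SimpleDateFormat, and that yyyyMMddHHmmssSSS is the intended pattern, are both assumptions on my side; the class name TimestampPattern and the sample timestamps are only for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimestampPattern {
    // Format an epoch-millisecond timestamp in UTC with the given pattern.
    public static String format(String pattern, long epochMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long nov19 = 1290155825123L; // 2010-11-19 08:37:05.123 UTC
        long oct19 = 1287477425123L; // 2010-10-19 08:37:05.123 UTC

        // Pattern reported in the thread: year, day of month, time. No month.
        System.out.println(format("yyyyddHHmmssSSS", nov19));   // 201019083705123
        System.out.println(format("yyyyddHHmmssSSS", oct19));   // 201019083705123

        // Assumed fix: add MM so the two dates no longer collide.
        System.out.println(format("yyyyMMddHHmmssSSS", nov19)); // 20101119083705123
        System.out.println(format("yyyyMMddHHmmssSSS", oct19)); // 20101019083705123
    }
}
```

Without the month, the same day and time in two different months produce the same prefix, so only the host and UID parts of the name keep the files apart.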


-- 
Best regards,

Ivy Tang
