Do you have two data streams tailing the same file? To verify, connect to the
agent's control port and list the running adaptors:

  telnet agent-host 9093
  list

Make sure you are not streaming the same data twice.
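
If two adaptors were accidentally added for the same file, the list output would
show two entries pointing at the same path, roughly like the hypothetical example
below (the adaptor ids, data type, and path are made up, and the exact output
format may differ between versions):

  adaptor_1001) org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 MyType 0 /var/log/app.log
  adaptor_1002) org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8 MyType 0 /var/log/app.log
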
regards,
Eric
On Oct 27, 2011, at 7:25 AM, AD wrote:
> I migrated to filetailer.CharFileTailingAdaptorUTF8, which ensures newline
> breaks, but I am now getting two of every entry. I put the debug code below
> into the demux parser; any idea why I would be seeing two of every event in
> the log?
>
> // fragment of the parse(...) method in my demux processor
> throws Throwable {
>     try {
>         // split the chunk into individual log lines
>         String[] lines = recordEntry.split("\\r?\\n");
>         // append every line we are about to parse to /tmp/demux.test
>         FileWriter out = new FileWriter("/tmp/demux.test", true);
>         PrintWriter p = new PrintWriter(out);
>         for (int i = 0; i < lines.length; i++) {
>             log.warn("*** TRYING TO PARSE **** " + lines[i]);
>             p.println(lines[i]);
>         }
>         p.close();
>
> On Thu, Oct 27, 2011 at 6:39 AM, AD <[email protected]> wrote:
> Yep, that was it, thanks. For the "recordEntry" variable, is there any way to
> guarantee how it is structured? I am testing tailing a file and I notice that
> recordEntry in the parser contains around 20-30 lines from the logfile; is this
> expected? Can we safely assume newline termination and just loop through? I was
> expecting one recordEntry for each line of the logfile, but I think this has to
> do with it being a ChukwaRecord rather than a single log entry.
>
>
> On Thu, Oct 27, 2011 at 1:25 AM, Eric Yang <[email protected]> wrote:
> Do you have both chukwa-core.jar and chukwa-core-0.4.0.jar? Run
> "jar tf chukwa-core-0.4.0.jar | grep TsProcessor2" and check whether
> chukwa-core-0.4.0.jar actually contains TsProcessor2.class.
>
> regards,
> Eric
>
> On Oct 26, 2011, at 7:59 PM, AD wrote:
>
> > Hey Eric,
> >
> > So as a test, I copied TsProcessor.java to TsProcessor2.java, changed the
> > references from TsProcessor to TsProcessor2, and updated my chukwa-demux.conf
> > to:
> >
> > <property>
> >   <name>TsProcessor</name>
> >   <value>org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor2</value>
> >   <description>Parser class for </description>
> > </property>
> >
> > I then ran ant in the project root, copied build/collector-0.4.0.war and
> > build/chukwa-core.jar to the root, and started the collector and demux.
> >
> > I am now getting the following errors; any ideas?
> >
> > org.apache.hadoop.chukwa.extraction.demux.processor.mapper.UnknownRecordTypeException:
> > Unknown parserClass:org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor2
> >
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.hadoop.chukwa.extraction.demux.processor.mapper.TsProcessor2
> >
> > On Wed, Oct 26, 2011 at 1:00 PM, Eric Yang <[email protected]> wrote:
> > Yes, write a mapper class which extends
> > org/apache/hadoop/chukwa/extraction/demux/processor/mapper/AbstractProcessor.java.
> > There are several extraction classes in
> > org/apache/hadoop/chukwa/extraction/demux/processor/mapper to use as examples.
> > Once the extraction class is written, configure chukwa-demux-conf.xml to map
> > the data type to the new extraction class.
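> >
> > For illustration, a minimal processor might look roughly like the sketch
> > below. This is untested; the class name, data type, and field name are made
> > up, so check the existing processors in that package for the exact signatures
> > in your version:
> >
> > package org.apache.hadoop.chukwa.extraction.demux.processor.mapper;
> >
> > import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecord;
> > import org.apache.hadoop.chukwa.extraction.engine.ChukwaRecordKey;
> > import org.apache.hadoop.mapred.OutputCollector;
> > import org.apache.hadoop.mapred.Reporter;
> >
> > public class MyLogProcessor extends AbstractProcessor {
> >
> >   @Override
> >   protected void parse(String recordEntry,
> >       OutputCollector<ChukwaRecordKey, ChukwaRecord> output,
> >       Reporter reporter) throws Throwable {
> >     // a chunk usually carries many log lines, so handle each one
> >     for (String line : recordEntry.split("\\r?\\n")) {
> >       long time = System.currentTimeMillis();  // or parse a timestamp out of the line
> >       ChukwaRecord record = new ChukwaRecord();
> >       record.add("myField", line);             // add whatever parsed fields you need
> >       // fills in the record key and standard fields for the given data type
> >       buildGenericRecord(record, line, time, "MyLogType");
> >       output.collect(key, record);             // 'key' is populated by buildGenericRecord
> >     }
> >   }
> > }
> >
> > Once that class is on the collector/demux classpath, map your adaptor's data
> > type to it in chukwa-demux-conf.xml, the same way the TsProcessor property is
> > set up earlier in this thread.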
> >
> > regards,
> > Eric
> >
> > On Oct 26, 2011, at 5:48 AM, AD wrote:
> >
> > > Interesting. So demux is now part of the collector and does not need to be
> > > run as a separate job?
> > >
> > > Since demux is doing basic ETL into HBase, is there a way to actually parse
> > > the fields of the log record and insert them as separate fields in HBase, so
> > > that mapreduce can run on them there, instead of the whole body going in as
> > > a single field (which is what I think is happening)?
> > >
> > > On Wed, Oct 26, 2011 at 12:59 AM, SALAMI Patrick <[email protected]>
> > > wrote:
> > > Ok, thanks!
> > >
> > > -----Original Message-----
> > > From: Eric Yang [mailto:[email protected]] On Behalf Of Eric Yang
> > > Sent: Tuesday, October 25, 2011 6:19 PM
> > > To: [email protected]
> > > Subject: Re: Chukwa trunk does not write to HBase
> > >
> > > The demuxer in Chukwa 0.4 is a mapreduce job acting as an ETL process to
> > > convert the data into a semi-structured format for further processing. In
> > > Chukwa trunk with HBase, demux runs as part of the collector for ETL, so
> > > there is no need to run a separate demux process.
> > >
> > > regards,
> > > Eric
> > >
> > > On Oct 25, 2011, at 3:49 PM, SALAMI Patrick wrote:
> > >
> > > > Also, while we are chatting, I was hoping to understand the role of the
> > > > demuxer. I am assuming that HICC pulls all of its data from HBase. If
> > > > using HBase, is it still necessary to run the demuxer? I didn't see any
> > > > mention of it in the latest quick start guide.
> > > >
> > > > Thanks!
> > > >
> > > > Patrick
> > > >
> > > > -----Original Message-----
> > > > From: Eric Yang [mailto:[email protected]] On Behalf Of Eric Yang
> > > > Sent: Tuesday, October 25, 2011 2:45 PM
> > > > To: [email protected]
> > > > Subject: Re: Chukwa trunk does not write to HBase
> > > >
> > > > Hadoop trunk will require a different configuration than the one
> > > > described in the Quick_Start guide:
> > > >
> > > > 1. Apply the HADOOP-7436 patch and rebuild hadoop.
> > > > 2. Copy the hadoop-metrics2.properties file enclosed in this message to
> > > >    HADOOP_CONF_DIR.
> > > >
> > > > Then restart hadoop. You might need to match the HBase table schema with
> > > > the metrics emitted by the Hadoop metrics2 framework (the sketch below
> > > > shows the general shape of such a file).
> > > > Hope this helps.
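> > > >
> > > > For reference, entries in hadoop-metrics2.properties have this general
> > > > shape. This is only an illustration using Hadoop's built-in FileSink, not
> > > > the Chukwa sink configuration attached to the original message:
> > > >
> > > >   # sample metrics every 10 seconds
> > > >   *.period=10
> > > >   # route namenode metrics to a local file
> > > >   namenode.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
> > > >   namenode.sink.file.filename=namenode-metrics.out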
> > > >
> > > >