Hi all. I have an ExecSource doing a tail -F on a log4j log file for an app, copying it into HDFS. I get no errors, warnings, or exceptions from the Flume nodes, but when I checked whether the contents of the files actually matched, I found that they did not. :( I tested several days' worth of files, and none matched. I'm not sure where to even start looking at this discrepancy. Does anyone have any thoughts?
If I had come across errors somewhere, I would understand some differences, but for everything to appear to work fine and then not match up concerns me. Thank you very much for any input.

Chris

In HDFS from Flume, file size in lines:

[root@hadoopnn01 ~]# time sudo -u hdfs hadoop fs -text /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.* | wc -l
2812850

Actual source file size in lines:

cneal@pegslog14[504]:/pegs/logcabin01/udprodae01/pegs/logs/udprodae01/d1c1_udprodae01/UD> time wc -l UDXMLTrans.log.2013-07-27
2812843 UDXMLTrans.log.2013-07-27

The source file:

cneal@pegslog14[505]:/pegs/logcabin01/udprodae01/pegs/logs/udprodae01/d1c1_udprodae01/UD> ls -l UDXMLTrans.log.2013-07-27
-rw-r--r-- 1 logger other 19228787343 Jul 28 00:00 UDXMLTrans.log.2013-07-27

The files in HDFS:

[root@hadoopnn01 ~]# time sudo -u hdfs hadoop fs -ls /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.*
Found 1 items
-rw-r--r-- 3 flume supergroup 200021549 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_1.1374883211499.gz
Found 1 items
-rw-r--r-- 3 flume supergroup 195398211 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_10.1374883210982.gz
Found 1 items
-rw-r--r-- 3 root supergroup 193557330 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_13.1374883212709.gz
Found 1 items
-rw-r--r-- 3 root supergroup 194163091 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_14.1374883212712.gz
Found 1 items
-rw-r--r-- 3 flume supergroup 192546288 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_2.1374883211446.gz
Found 1 items
-rw-r--r-- 3 root supergroup 191863735 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_5.1374883208056.gz
Found 1 items
-rw-r--r-- 3 root supergroup 196733297 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_6.1374883208056.gz
Found 1 items
-rw-r--r-- 3 flume supergroup 193451845 2013-07-28 00:00 /pegs/logs/udprodae01/d1c1_udprodae01/UD/UDTrans/2013-07-27/UDXMLTrans.log.2013-07-27_9.1374883210989.gz
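
The line counts above differ by 7 (2812850 in HDFS versus 2812843 in the source), but a raw count cannot say whether lines were duplicated, dropped, or both. One way I could narrow that down is sketched below in Python. It is only a rough sketch, assuming the .gz parts have first been copied out of HDFS to local disk; the function names are my own, not anything from Flume or Hadoop. It hashes each line so that the 19 GB source does not have to be held in memory, then reports how many lines are under- and over-represented on the HDFS side.

```python
import gzip
import hashlib
from collections import Counter


def line_digests(lines):
    """Count occurrences of each line, keyed by a 16-byte digest of its content."""
    counts = Counter()
    for line in lines:
        # Strip the line terminator so "\n" vs "\r\n" does not cause false mismatches.
        key = hashlib.blake2b(line.rstrip(b"\r\n"), digest_size=16).digest()
        counts[key] += 1
    return counts


def compare(source_lines, sink_lines):
    """Return (missing, extra): line occurrences absent from / surplus in the sink."""
    src = line_digests(source_lines)
    dst = line_digests(sink_lines)
    missing = sum((src - dst).values())  # in source more often than in sink
    extra = sum((dst - src).values())    # in sink more often than in source
    return missing, extra


def iter_gz_lines(paths):
    """Yield raw lines from gzip files (assumed local copies of the HDFS parts)."""
    for path in paths:
        with gzip.open(path, "rb") as fh:
            yield from fh
```

To use it, I would open the source file in binary mode, pass the file object as source_lines, and pass iter_gz_lines(...) over the local copies of the parts as sink_lines. A result such as (0, 7) would point at duplicated events, while a non-zero first number would mean lines were also lost.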