[
https://issues.apache.org/jira/browse/FLUME-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Will McQueen updated FLUME-1233:
--------------------------------
Description:
Steps:
1) Create a flume.conf file that specifies a bucket path with an escape
sequence. Here's a partial config file:
agent.sinks.k1.channel = c1
agent.sinks.k1.type = HDFS
#agent.sinks.k1.hdfs.round = true
#agent.sinks.k1.hdfs.roundUnit = minute
#agent.sinks.k1.hdfs.roundValue = 2
agent.sinks.k1.hdfs.path = hdfs://blah.example.com/blah-test-ch01-%{host}
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.rollInterval = 0
agent.sinks.k1.hdfs.rollSize = 0
agent.sinks.k1.hdfs.rollCount = 0
agent.sinks.k1.hdfs.batchSize = 1000
agent.sinks.k1.hdfs.txnEventMax = 1000
2) Try to send an event that has a timestamp in its header (HINT: you can use
an interceptor to add a timestamp to the header of all events generated by
SequenceGeneratorSource)
You'll see ERROR (exceptions) in the log.
2012-05-29 09:35:20,343 INFO hdfs.BucketWriter: Creating hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp
2012-05-29 09:35:20,359 ERROR hdfs.HDFSEventSink: process failed
java.lang.IllegalArgumentException: Pathname /blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp from hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp is not a valid DFS filename.
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:165)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:219)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:584)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:464)
    at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:60)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:121)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:179)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:432)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:429)
    at org.apache.flume.sink.hdfs.HDFSEventSink$ProxyCallable.call(HDFSEventSink.java:164)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-05-29 09:35:20,361 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.IllegalArgumentException: Pathname /blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp from hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp is not a valid DFS filename.
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:469)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: Pathname /blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp from hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp is not a valid DFS filename.
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:165)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:219)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:584)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:464)
    at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:60)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:121)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:179)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:432)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:429)
    at org.apache.flume.sink.hdfs.HDFSEventSink$ProxyCallable.call(HDFSEventSink.java:164)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    ... 1 more
NOTE: According to the docs, %c is "locale's date and time", and the example
it gives is "Thu Mar 3 23:05:25 2005".
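For reference, a minimal sketch of the pieces the partial config above leaves out. Several things here are assumptions, not taken from the original report: the source name r1 is hypothetical, the seq source type and the timestamp interceptor alias are as described in the Flume user guide, and the %c path is inferred from the issue title and the logged pathname (the partial config above shows %{host} instead). The commented-out alternative is a possible workaround, not a confirmed fix: it uses only numeric date escapes, so the expanded path contains none of the spaces or colons seen in the rejected pathname.

```properties
# Hypothetical source name r1; not part of the original partial config.
agent.sources.r1.type = seq
agent.sources.r1.channels = c1
agent.sources.r1.interceptors = i1
# Adds a "timestamp" header to every event (alias assumed from the user guide).
agent.sources.r1.interceptors.i1.type = timestamp

# Assumed path that triggers the failure: %c expands to a string such as
# "Tue May 29 09:35:18 2012", as seen in the log.
agent.sinks.k1.hdfs.path = hdfs://blah.example.com/blah-test-ch08-%c

# Possible workaround (assumption): numeric escapes only.
#agent.sinks.k1.hdfs.path = hdfs://blah.example.com/blah-test-ch08-%Y-%m-%d-%H%M%S
```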
was:
Steps:
1) Create a flume.conf file that specifies a bucket path with an escape
sequence. Here's a partial config file:
agent.sinks.k1.channel = c1
agent.sinks.k1.type = HDFS
#agent.sinks.k1.hdfs.round = true
#agent.sinks.k1.hdfs.roundUnit = minute
#agent.sinks.k1.hdfs.roundValue = 2
agent.sinks.k1.hdfs.path = hdfs://blah.example.com/blah-test-ch01-%{host}
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.rollInterval = 0
agent.sinks.k1.hdfs.rollSize = 0
agent.sinks.k1.hdfs.rollCount = 0
agent.sinks.k1.hdfs.batchSize = 1000
agent.sinks.k1.hdfs.txnEventMax = 1000
2) Tr7 to send an event that has a timestamp in its header (HINT: you can use
an interceptor to add a timestamp to the header of all events generated by
SequenceGeneratorSource)
You'll see ERROR (exceptions) in the log.
2012-05-29 09:35:20,343 INFO hdfs.BucketWriter: Creating hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp
2012-05-29 09:35:20,359 ERROR hdfs.HDFSEventSink: process failed
java.lang.IllegalArgumentException: Pathname /blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp from hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp is not a valid DFS filename.
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:165)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:219)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:584)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:464)
    at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:60)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:121)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:179)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:432)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:429)
    at org.apache.flume.sink.hdfs.HDFSEventSink$ProxyCallable.call(HDFSEventSink.java:164)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
2012-05-29 09:35:20,361 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.IllegalArgumentException: Pathname /blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp from hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp is not a valid DFS filename.
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:469)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.IllegalArgumentException: Pathname /blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp from hdfs://blah.example.com/blah-test-ch08-Tue May 29 09:35:18 2012/FlumeData.1274356498034827.tmp is not a valid DFS filename.
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:165)
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:219)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:584)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:464)
    at org.apache.flume.sink.hdfs.HDFSDataStream.open(HDFSDataStream.java:60)
    at org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:121)
    at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:179)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:432)
    at org.apache.flume.sink.hdfs.HDFSEventSink$1.doCall(HDFSEventSink.java:429)
    at org.apache.flume.sink.hdfs.HDFSEventSink$ProxyCallable.call(HDFSEventSink.java:164)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    ... 1 more
> HDFS Sink has problem with %c escape sequence in bucket path
> ------------------------------------------------------------
>
> Key: FLUME-1233
> URL: https://issues.apache.org/jira/browse/FLUME-1233
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.2.0
> Environment: CentOS 5.6 64-bit
> Reporter: Will McQueen
> Fix For: v1.2.0
>
>