Updated Branches: refs/heads/flume-1.5 dee057863 -> 9ef11916c
FLUME-2185. Upgrade morphlines to 0.7.0

(Wolfgang Hoschek via Hari Shreedharan)

Project: http://git-wip-us.apache.org/repos/asf/flume/repo
Commit: http://git-wip-us.apache.org/repos/asf/flume/commit/9ef11916
Tree: http://git-wip-us.apache.org/repos/asf/flume/tree/9ef11916
Diff: http://git-wip-us.apache.org/repos/asf/flume/diff/9ef11916

Branch: refs/heads/flume-1.5
Commit: 9ef11916c8f3f7922b673f72c2ae6a410b26b48d
Parents: dee0578
Author: Hari Shreedharan <hshreedha...@apache.org>
Authored: Mon Sep 16 15:38:13 2013 -0700
Committer: Hari Shreedharan <hshreedha...@apache.org>
Committed: Mon Sep 16 15:39:25 2013 -0700

----------------------------------------------------------------------
 flume-ng-doc/sphinx/FlumeUserGuide.rst   |  4 ++--
 .../flume-ng-morphline-solr-sink/pom.xml | 20 +-------------------
 2 files changed, 3 insertions(+), 21 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/flume/blob/9ef11916/flume-ng-doc/sphinx/FlumeUserGuide.rst
----------------------------------------------------------------------
diff --git a/flume-ng-doc/sphinx/FlumeUserGuide.rst b/flume-ng-doc/sphinx/FlumeUserGuide.rst
index c614991..bbfb5d0 100644
--- a/flume-ng-doc/sphinx/FlumeUserGuide.rst
+++ b/flume-ng-doc/sphinx/FlumeUserGuide.rst
@@ -1839,7 +1839,7 @@ This sink extracts data from Flume events, transforms it, and loads it in near-r
 This sink is well suited for use cases that stream raw data into HDFS (via the HdfsSink) and simultaneously extract, transform and load the same data into Solr (via MorphlineSolrSink). In particular, this sink can process arbitrary heterogeneous raw data from disparate data sources and turn it into a data model that is useful to Search applications.
-The ETL functionality is customizable using a `morphline configuration file <http://cloudera.github.io/cdk/docs/0.4.1/cdk-morphlines/index.html>`_ that defines a chain of transformation commands that pipe event records from one command to another.
+The ETL functionality is customizable using a `morphline configuration file <http://cloudera.github.io/cdk/docs/current/cdk-morphlines/index.html>`_ that defines a chain of transformation commands that pipe event records from one command to another.
 Morphlines can be seen as an evolution of Unix pipelines where the data model is generalized to work with streams of generic records, including arbitrary binary payloads. A morphline command is a bit like a Flume Interceptor. Morphlines can be embedded into Hadoop components such as Flume.
@@ -2599,7 +2599,7 @@ prefix "" The prefix string constant to prepend to each generat
 Morphline Interceptor
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This interceptor filters the events through a `morphline configuration file <http://cloudera.github.io/cdk/docs/0.4.1/cdk-morphlines/index.html>`_ that defines a chain of transformation commands that pipe records from one command to another.
+This interceptor filters the events through a `morphline configuration file <http://cloudera.github.io/cdk/docs/current/cdk-morphlines/index.html>`_ that defines a chain of transformation commands that pipe records from one command to another.
 For example the morphline can ignore certain events or alter or insert certain event headers via regular expression based pattern matching, or it can auto-detect and set a MIME type via Apache Tika on events that are intercepted. For example, this kind of packet sniffing can be used for content based dynamic routing in a Flume topology. MorphlineInterceptor can also help to implement dynamic routing to multiple Apache Solr collections (e.g. for multi-tenancy).
http://git-wip-us.apache.org/repos/asf/flume/blob/9ef11916/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml
----------------------------------------------------------------------
diff --git a/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml b/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml
index a2fb931..b2640d9 100644
--- a/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml
+++ b/flume-ng-sinks/flume-ng-morphline-solr-sink/pom.xml
@@ -33,8 +33,7 @@ limitations under the License.
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
     <solr.version>4.3.0</solr.version>
     <solr.expected.version>4.3.0</solr.expected.version> <!-- sanity check to verify we actually run against the expected version rather than some outdated version -->
-    <tika.version>1.3</tika.version>
-    <cdk.version>0.6.0</cdk.version>
+    <cdk.version>0.7.0</cdk.version>
     <slf4j.version>1.6.1</slf4j.version>
     <surefire.version>2.12.4</surefire.version>
   </properties>
@@ -108,23 +107,6 @@ limitations under the License.
       </exclusions>
     </dependency>

-    <dependency> <!-- see http://tika.apache.org -->
-      <groupId>org.apache.tika</groupId>
-      <artifactId>tika-xmp</artifactId>
-      <version>${tika.version}</version>
-      <scope>test</scope>
-      <exclusions>
-        <exclusion>
-          <groupId>org.apache.geronimo.specs</groupId>
-          <artifactId>geronimo-stax-api_1.0_spec</artifactId> <!-- needed by tika-parsers but already provided by JDK -->
-        </exclusion>
-        <exclusion>
-          <groupId>xerces</groupId>
-          <artifactId>xercesImpl</artifactId> <!-- used by com.drewnoakes:metadata-extractor:jar but replacing built-in XML parser with legacy xerces is scary and probably don't need it -->
-        </exclusion>
-      </exclusions>
-    </dependency>
-
     <dependency>
       <groupId>com.cloudera.cdk</groupId>
       <artifactId>cdk-morphlines-solr-core</artifactId>