[CONF] Apache Camel > Splitter

2014-06-16 Thread Ben O'Day (Confluence)














  


Ben O'Day edited the page:
 


Splitter   




 Comment: updated examples per SO discussion 


...
For further examples of this pattern in use you could look at one of the JUnit test cases.
 Splitting a Collection, Iterator or Array 
 A common use case is to split a Collection, Iterator or Array from the message. In the sample below we simply use an Expression to identify the value to split.



from("direct:splitUsingBody").split(body()).to("mock:result");

from("direct:splitUsingHeader").split(header("foo")).to("mock:result"); 
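As a rough illustration of what splitting a body means, here is a plain-Java sketch. It is not Camel's implementation, and the names `SplitSketch` and `split` are made up for this example. It turns a Collection, Iterator or array into a list of parts, each of which would travel on as its own message.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Camel: shows how one body value
// can be broken into individual parts.
class SplitSketch {

    static List<Object> split(Object body) {
        List<Object> parts = new ArrayList<Object>();
        if (body == null) {
            return parts; // nothing to split
        }
        if (body instanceof Iterable) {
            // covers every Collection
            for (Object part : (Iterable<?>) body) {
                parts.add(part);
            }
        } else if (body instanceof Iterator) {
            Iterator<?> it = (Iterator<?>) body;
            while (it.hasNext()) {
                parts.add(it.next());
            }
        } else if (body.getClass().isArray()) {
            // java.lang.reflect.Array also handles primitive arrays
            int length = Array.getLength(body);
            for (int i = 0; i < length; i++) {
                parts.add(Array.get(body, i));
            }
        } else {
            parts.add(body); // a single value is passed through as one part
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split(Arrays.asList("a", "b", "c")));
        System.out.println(split(new int[] {1, 2}));
    }
}
```

In a route, the same idea is expressed declaratively with `split(body())` or `split(header("foo"))`.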
 



 In Spring XML you can use the Simple language to identify the value to split. 




[CONF] Apache Camel > Recipient List

2014-03-10 Thread Ben O'Day (Confluence)














  


Ben O'Day edited the page:
 


Recipient List   




 Comment: CAMEL-6665 disable delimiter support 


...



 Wiki Markup




 {div:class=confluenceTableSmall}
|| Name || Default Value || Description ||
| {{delimiter}} | {{,}} | Delimiter used if the [Expression] returned multiple endpoints. From *Camel 2.13* onwards it can be disabled using "false". |
| {{strategyRef}} | | An [AggregationStrategy|http://camel.apache.org/maven/current/camel-core/apidocs/org/apache/camel/processor/aggregate/AggregationStrategy.html] that will assemble the replies from recipients into a single outgoing message from the [Recipient List]. By default Camel will use the last reply as the outgoing message. From *Camel 2.12* onwards you can also use a POJO as the {{AggregationStrategy}}, see the [Aggregate|Aggregator2] page for more details. |
| {{strategyMethodName}} | | *Camel 2.12:* This option can be used to explicitly declare the method name to use, when using POJOs as the {{AggregationStrategy}}. See the [Aggregate|Aggregator2] page for more details. |
| {{strategyMethodAllowNull}} | {{false}} | *Camel 2.12:* If this option is {{false}} then the aggregate method is not used if there was no data to enrich. If this option is {{true}} then a {{null}} value is used as the {{oldExchange}} (when there is no data to enrich), when using POJOs as the {{AggregationStrategy}}. See the [Aggregate|Aggregator2] page for more details. |
| {{parallelProcessing}} | {{false}} | *Camel 2.2:* If enabled, messages are sent to the recipients concurrently. Note that the calling thread will still wait until all messages have been fully processed before it continues; it's the sending and processing of replies from recipients which happens in parallel. |
| {{executorServiceRef}} | | *Camel 2.2:* A custom [Thread Pool|Threading Model] to use for parallel processing. Note that enabling this option implies parallel processing, so you need not enable that option as well. |
| {{stopOnException}} | {{false}} | *Camel 2.2:* Whether to immediately stop processing when an exception occurs. If disabled, Camel will send the message to all recipients regardless of any individual failures. You can process exceptions in an [AggregationStrategy|http://camel.apache.org/maven/current/camel-core/apidocs/org/apache/camel/processor/aggregate/AggregationStrategy.html] implementation, which supports full control of error handling. |
| {{ignoreInvalidEndpoints}} | {{false}} | *Camel 2.3:* Whether to ignore an endpoint URI that could not be resolved. If disabled, Camel will throw an exception identifying the invalid endpoint URI. |
| {{streaming}} | {{false}} | *Camel 2.5:* If enabled, Camel will process replies out-of-order - that is, in the order received in reply from each recipient. If disabled, Camel will process replies in the same order as specified by the [Expression]. |
| {{timeout}} | | *Camel 2.5:* Specifies a processing timeout in milliseconds. If the [Recipient List] hasn't been able to send and process all replies within this timeframe, then the timeout triggers and the [Recipient List] breaks out, with message flow continuing to the next element. Note that if you provide a [TimeoutAwareAggregationStrategy|http://camel.apache.org/maven/current/camel-core/apidocs/org/apache/camel/processor/aggregate/TimeoutAwareAggregationStrategy.html], its {{timeout}} method is invok

[CONF] Apache Camel > HDFS

2013-11-07 Thread Ben O'Day (Confluence)







HDFS
Page edited by Ben O'Day


Comment:
added split strategy note per CAMEL-6864


 Changes (1)
 




...
* IDLE a new file is created, and the old is closed when no writing happened in the last <value> milliseconds
{note} This strategy currently requires either setting an IDLE value or setting the HdfsConstants.HDFS_CLOSE header to false to use the BYTES/MESSAGES configuration; otherwise, the file will be closed with each message. {note}
for example: {code} 
...


Full Content

HDFS Component
Available as of Camel 2.8

The hdfs component enables you to read and write messages from/to an HDFS file system. HDFS is the distributed file system at the heart of Hadoop.

Maven users will need to add the following dependency to their pom.xml for this component:



<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-hdfs</artifactId>
    <version>x.x.x</version>
</dependency>





URI format



hdfs://hostname[:port][/path][?options]



You can append query options to the URI in the following format, ?option=value&option=value&...
The path is treated in the following way:

	as a consumer: if the path is a file, it just reads the file; if it is a directory, it scans all the files under the path that satisfy the configured pattern. All the files under that directory must be of the same type.
	as a producer: if at least one split strategy is defined, the path is considered a directory, and under that directory the producer creates a different file per split, named using the configured UuidGenerator.
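The URI format and the `?option=value&option=value&...` convention can be sketched with a small string builder. This is only an illustration: `HdfsUriSketch` and `build` are hypothetical names, the host, port and path in `main` are example values, and no URL encoding is applied.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of camel-hdfs: assembles an endpoint URI
// string in the hdfs://hostname[:port][/path][?options] format.
class HdfsUriSketch {

    static String build(String host, int port, String path, Map<String, String> options) {
        StringBuilder uri = new StringBuilder("hdfs://").append(host);
        if (port > 0) {
            uri.append(':').append(port); // the port is optional
        }
        if (path != null && !path.isEmpty()) {
            uri.append(path.startsWith("/") ? path : "/" + path);
        }
        // first option is introduced by '?', the following ones by '&'
        char separator = '?';
        for (Map.Entry<String, String> option : options.entrySet()) {
            uri.append(separator).append(option.getKey()).append('=').append(option.getValue());
            separator = '&';
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<String, String>();
        options.put("fileType", "SEQUENCE_FILE");
        options.put("replication", "3");
        // example values only; adjust host, port and path to your cluster
        System.out.println(build("localhost", 8020, "/tmp/data", options));
    }
}
```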



Options



|| Name || Default Value || Description ||
| overwrite | true | The file can be overwritten. |
| append | false | Append to an existing file. Note that not all HDFS file systems support the append option. |
| bufferSize | 4096 | The buffer size used by HDFS. |
| replication | 3 | The HDFS replication factor. |
| blockSize | 67108864 | The size of the HDFS blocks. |
| fileType | NORMAL_FILE | It can be SEQUENCE_FILE, MAP_FILE, ARRAY_FILE, or BLOOMMAP_FILE; see Hadoop. |
| fileSystemType | HDFS | It can be LOCAL for the local filesystem. |
| keyType | NULL | The type for the key in case of sequence or map files. See below. |
| valueType | TEXT | The type for the value in case of sequence or map files. See below. |
| splitStrategy | | A string describing the strategy on how to split the file based on different criteria. See below. |
| openedSuffix | opened | When a file is opened for reading/writing, it is renamed with this suffix to avoid reading it during the writing phase. |
| readSuffix | read | Once the file has been read, it is renamed with this suffix to avoid reading it again. |
| initialDelay | 0 | For the consumer, how long to wait (in milliseconds) before starting to scan the directory. |
| delay | 0 | The interval (in milliseconds) between the directory scans. |
| pattern | * | The pattern used for scanning the directory. |
| chunkSize | 4096 | When reading a normal file, it is split into chunks, producing one message per chunk. |
| connectOnStartup | true | Camel 2.9.3/2.10.1: Whether to connect to the HDFS file system on starting the producer/consumer. If false then the connection is created on-demand. Notice that HDFS may take up to 15 minutes to establish a connection, as it has a hardcoded 45 x 20 sec redelivery. Setting this option to false allows your application to start up without blocking for up to 15 minutes. |





KeyType and ValueType

	NULL it means that the key or the value is absent
	BYTE for writing a byte, the java Byte class is mapped into a BYTE
	BYTES for writing a sequence of bytes. It maps the java ByteBuffer class
	INT for writing java integer
	FLOAT for writing java float
	LONG for writing java long
	DOUBLE for writing java double
	TEXT for writing java strings



BYTES is also used with everything else. For example, in Camel a file is sent around as an InputStream; in this case it is written to a sequence file or a map file as a sequence of bytes.
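The mapping listed above can be summarised in a short sketch. This is not the camel-hdfs source: `HdfsTypeSketch`, `WritableType` and `typeFor` are hypothetical names. In the component the types are chosen through the keyType and valueType options, so the code only mirrors the list, including the BYTES fallback for everything else.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not the camel-hdfs source: picks a type name from
// the KeyType/ValueType list for a given Java value.
class HdfsTypeSketch {

    enum WritableType { NULL, BYTE, BYTES, INT, FLOAT, LONG, DOUBLE, TEXT }

    static WritableType typeFor(Object value) {
        if (value == null) {
            return WritableType.NULL;   // key or value is absent
        }
        if (value instanceof Byte) {
            return WritableType.BYTE;
        }
        if (value instanceof ByteBuffer) {
            return WritableType.BYTES;
        }
        if (value instanceof Integer) {
            return WritableType.INT;
        }
        if (value instanceof Float) {
            return WritableType.FLOAT;
        }
        if (value instanceof Long) {
            return WritableType.LONG;
        }
        if (value instanceof Double) {
            return WritableType.DOUBLE;
        }
        if (value instanceof String) {
            return WritableType.TEXT;
        }
        // everything else (for example an InputStream) is written as bytes
        return WritableType.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(typeFor("hello"));
        System.out.println(typeFor(42L));
    }
}
```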

Splitting Strategy
In the current version of Hadoop opening a file in append mode is disabled since it's not very reliable. So, for the moment, it's only possible to create new files. The Camel HDFS endpoint tries to solve this problem in this way:

	If the split strategy option has been defined, the hdfs path will be used as a directory and files will be created using the configured UuidGenerator
	Every time a splitting condition is met, a new file is created.
The splitStrategy option is defined as a string with the following syntax:
splitStrategy=<ST>:<value>,<ST>:<value>,*

where <ST> can be:

	BYTES a new file is created, and the old is closed when the number of written bytes is more than <value>
	MESSAGES a new file is created, and the old is closed when the number of written messages is more than <value>
	IDLE a new file is created, and the o

[CONF] Apache Camel > HDFS

2013-10-18 Thread Ben O'Day (Confluence)







HDFS
Page edited by Ben O'Day


Comment:
update per CAMEL-6867 - using UUID generator for split filenames


 Changes (4)
 




...
The path is treated in the following way: # as a consumer, if it's a file, it just reads the file, otherwise if it represents a directory it scans all the file under the path satisfying the configured pattern. All the files under that directory must be of the same type. 
# as a producer, if at least one split strategy is defined, the path is considered a directory and under that directory the producer creates a different file per split named using the configured [uuidgenerator]. 
 h3. Options 
...
h3. Splitting Strategy In the current version of Hadoop opening a file in append mode is disabled since it's not very reliable. So, for the moment, it's only possible to create new files. The Camel HDFS endpoint tries to solve this problem in this way: 
* If the split strategy option has been defined, the actual file name will become a directory name and a /seg0 will be initially created. * Every time a splitting condition is met a new file is created with name /segN where N is 1, 2, 3, etc. 
* If the split strategy option has been defined, the hdfs path will be used as a directory and files will be created using the configured [uuidgenerator]  * Every time a splitting condition is met, a new file is created. 
The splitStrategy option is defined as a string with the following syntax: splitStrategy=<ST>:<value>,<ST>:<value>,* 
...
hdfs://localhost/tmp/simple-file?splitStrategy=IDLE:1000,BYTES:5 {code} 
it means: a new file is created either when it has been idle for more than 1 second or if more than 5 bytes have been written. So, running {{hadoop fs -ls /tmp/simple-file}} you'll find multiple files created, named using the [uuidgenerator]. 
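To make the syntax and the "either condition triggers a split" rule concrete, here is a minimal sketch that parses a splitStrategy string such as IDLE:1000,BYTES:5 and evaluates it. It is an assumption-laden illustration, not the camel-hdfs source: `SplitStrategySketch`, `Rule`, `parse` and `shouldSplit` are hypothetical names, and the counters passed in are supplied by the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the camel-hdfs source: parses a splitStrategy
// string and checks whether any condition asks for a new file.
class SplitStrategySketch {

    enum Type { BYTES, MESSAGES, IDLE }

    static final class Rule {
        final Type type;
        final long value;

        Rule(Type type, long value) {
            this.type = type;
            this.value = value;
        }

        // True when this rule alone would trigger a split ("more than" is strict).
        boolean matches(long bytesWritten, long messagesWritten, long idleMillis) {
            switch (type) {
                case BYTES:
                    return bytesWritten > value;
                case MESSAGES:
                    return messagesWritten > value;
                default: // IDLE
                    return idleMillis > value;
            }
        }
    }

    // Parses comma-separated TYPE:value pairs.
    static List<Rule> parse(String splitStrategy) {
        List<Rule> rules = new ArrayList<Rule>();
        for (String token : splitStrategy.split(",")) {
            String[] pair = token.trim().split(":");
            if (pair.length != 2) {
                throw new IllegalArgumentException("Expected TYPE:value but got: " + token);
            }
            rules.add(new Rule(Type.valueOf(pair[0].trim()), Long.parseLong(pair[1].trim())));
        }
        return rules;
    }

    // A new file is started as soon as any one rule matches.
    static boolean shouldSplit(List<Rule> rules, long bytesWritten, long messagesWritten, long idleMillis) {
        for (Rule rule : rules) {
            if (rule.matches(bytesWritten, messagesWritten, idleMillis)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse("IDLE:1000,BYTES:5");
        System.out.println(shouldSplit(rules, 6, 1, 0));   // 6 bytes written, not idle
        System.out.println(shouldSplit(rules, 3, 1, 200)); // below both thresholds
    }
}
```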
 h3. Message Headers 
...


Full Content

HDFS Component
Available as of Camel 2.8

The hdfs component enables you to read and write messages from/to an HDFS file system. HDFS is the distributed file system at the heart of Hadoop.

Maven users will need to add the following dependency to their pom.xml for this component:



<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-hdfs</artifactId>
    <version>x.x.x</version>
</dependency>





URI format



hdfs://hostname[:port][/path][?options]



You can append query options to the URI in the following format, ?option=value&option=value&...
The path is treated in the following way:

	as a consumer: if the path is a file, it just reads the file; if it is a directory, it scans all the files under the path that satisfy the configured pattern. All the files under that directory must be of the same type.
	as a producer: if at least one split strategy is defined, the path is considered a directory, and under that directory the producer creates a different file per split, named using the configured UuidGenerator.



Options



|| Name || Default Value || Description ||
| overwrite | true | The file can be overwritten. |
| append | false | Append to an existing file. Note that not all HDFS file systems support the append option. |
| bufferSize | 4096 | The buffer size used by HDFS. |
| replication | 3 | The HDFS replication factor. |
| blockSize | 67108864 | The size of the HDFS blocks. |
| fileType | NORMAL_FILE | It can be SEQUENCE_FILE, MAP_FILE, ARRAY_FILE, or BLOOMMAP_FILE; see Hadoop. |
| fileSystemType | HDFS | It can be LOCAL for the local filesystem. |
| keyType | NULL | The type for the key in case of sequence or map files. See below. |
| valueType | TEXT | The type for the value in case of sequence or map files. See below. |
| splitStrategy | | A string describing the strategy on how to split the file based on different criteria. See below. |
| openedSuffix | opened | When a file is opened for reading/writing, it is renamed with this suffix to avoid reading it during the writing phase. |
| readSuffix | read | Once the file has been read, it is renamed with this suffix to avoid reading it again. |
| initialDelay | 0 | For the consumer, how long to wait (in milliseconds) before starting to scan the directory. |
| delay | 0 | The interval (in milliseconds) between the directory scans. |
| pattern | * | The pattern used for scanning the directory. |
| chunkSize | 4096 | When reading a normal file, it is split into chunks, producing one message per chunk. |
| connectOnStartup | true | Camel 2.9.3/2.10.1: Whether to connect to the HDFS file system on starting the producer/consumer. If false then the connection is created on-demand. Notice that HDFS may take up to 15 minutes to establish a connection, as it has a hardcoded 45 x 20 sec redelivery. By s

[CONF] Apache Camel > Camel 2.13.0 Release

2013-10-17 Thread Ben O'Day (Confluence)







Camel 2.13.0 Release
Page edited by Ben O'Day


Comment:
updated per CAMEL-6028


 Changes (1)
 




...
* [VM] component now supports {{multipleConsumers=true}} across deployment units. * Added {{@PreConsumed}} to [JPA] consumer. 
* Added CamelFileName header support to the [hdfs] producer 
 h3. Fixed Issues 
...


Full Content

Camel 2.13.0 release (currently in progress)




New and Noteworthy

Welcome to the 2.13.0 release, with approximately XXX issues resolved (new features, improvements and bug fixes such as...)


	When using multiple OSGi Blueprint property-placeholder elements, Camel now favors using non-default placeholders, or the last property-placeholder defined in the Blueprint XML file. This allows you, for example, to define default properties in one placeholder and override these values in other placeholders.
	The FTP consumer can now download a single named file without using the FTP LIST command. This allows downloading a known file from an FTP server even when the user account does not have permission to run the FTP LIST command.
	The FTP consumer can now ignore file-not-found or insufficient-file-permission errors.
	Data Format using marshal now leverages Stream caching out of the box if enabled, which allows marshalling big streams and spooling to disk, instead of being purely in-memory based.
	Improved using Bean when the bean is looked up in the Registry, when using concurrent processing in the route.
	Added cache option to beanRef and  in the DSL. This avoids looking up the Bean from the Registry on each usage; this can safely be done for singleton beans.
	Configuring Data Formats in XML attributes now supports reference lookup using the # syntax, eg 
	The JDBC component now also supports outputType to specify the expected output as either a List or a single Object. It also allows mapping to a bean using a BeanRowMapper to control the mapping of ROW names to bean properties.
	Both the Quartz and the Quartz2 based ScheduledRoutePolicy have been improved to better support cluster setups (e.g. to not schedule jobs that are already scheduled through another node inside a given cluster).
	Reduced the work the Aggregate EIP does while holding a lock during aggregation, which can lead to improved performance in some use-cases.
	JndiRegistry now implements all the find methods.
	VM component now supports multipleConsumers=true across deployment units.
	Added @PreConsumed to JPA consumer.
	Added CamelFileName header support to the HDFS producer



Fixed Issues


	Fixed an ArrayIndexOutOfBoundsException with Message History when using SEDA
	Fixed requestTimeout on Netty not triggering when we have received message.
	Fixed Parameter Binding Annotations on boolean types to evaluate as Predicate instead of Expression
	Fixed using File consumer with delete=true&readLock=fileLock not being able to delete the file on Windows.
	Fixed Throttler to honor time slots after period expires (eg so it works consistently and as expected).
	Fixed getting JMSXUserID property when consuming from ActiveMQ
	Fixed interceptFrom to support property placeholders
	Fixed a race condition in initializing SSLContext in Netty and Netty HTTP
	Fixed using Recipient List or Routing Slip to call another route which is configured with NoErrorHandler: an exception that occurred in that route is now propagated back as not-exhausted, allowing the caller route's error handler to react to the exception.
	Fixed Quartz: an exception thrown when scheduling a job would affect shutdown, as the job was assumed to still be in progress and the Quartz scheduler was not shut down.
	Fixed so you can configure Stomp endpoints using URIs
	Fixed memory leak when using Language component with camel-script languages and having contentCache=false
	Fixed Error Handler may log at WARN level "Cannot determine current route from Exchange" when using Splitter



New Enterprise Integration Patterns

New Components


	camel-infinispan
	camel-splunk - enables you to publish and search for events in Splunk



New Camel Maven Archetypes


	camel-archetype-cxf-code-first-blueprint
	camel-archetype-cxf-contract-first-blueprint



New DSL

New Annotations

New Data Formats

New Languages


	JSonPath - To perform Expression and Predicate on json payloads.



New Examples

New Tutorials

API changes

Known Issues

Dependency Upgrades


	APNS 0.1.6 to 0.2.3
	BeanIO 2.0.6 to 2.0.7
	CXF 2.7.6 to 2.7.7
	EHCache 2.7.2 to 2.7.4
	Elasticsearch 0.20.6 to 0.90.3
	FOP 1.0 to 1.1
	Guava 14.0.1 to 15.0
	Hazelcast 2.6 to 3.0.2
	ICal4j 1.0.4 to 1.0.5.2
	Jetty 7.6.9 to 8.1.12
	Joda time 2.1 to 2.3
	JRuby 1.7.4 to 1.7.5
	Lucene 3.6.0 to 4.4.0
	MongoDB Java Driver 2.11.2 to 2.11.3
	Quartz 2.2.0 to 2.2.1
	Restlet 2.0.15 to 2.1.4
	Saxon 9.5.0.2 to 9.5.1-2

[CONF] Apache Camel > HDFS

2013-10-17 Thread Ben O'Day (Confluence)







HDFS
Page edited by Ben O'Day


Comment:
updated per CAMEL-6028


 Changes (2)
 




...
it means: a new file is created either when it has been idle for more than 1 second or if more than 5 bytes have been written. So, running {{hadoop fs -ls /tmp/simple-file}} you'll find the following files seg0, seg1, seg2, etc  
h3. Message Headers 
 
The following headers are supported by this component:
h4. Producer only
{div:class=confluenceTableSmall}
|| Header || Description ||
| {{CamelFileName}} | *Camel 2.13:* Specifies the name of the file to write (relative to the endpoint path). The name can be a {{String}} or an [Expression] object. Only relevant when not using a split strategy. |
{div}
h3. Controlling to close file stream *Available as of Camel 2.10.4* 
...


Full Content

HDFS Component
Available as of Camel 2.8

The hdfs component enables you to read and write messages from/to an HDFS file system. HDFS is the distributed file system at the heart of Hadoop.

Maven users will need to add the following dependency to their pom.xml for this component:



<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-hdfs</artifactId>
    <version>x.x.x</version>
</dependency>





URI format



hdfs://hostname[:port][/path][?options]



You can append query options to the URI in the following format, ?option=value&option=value&...
The path is treated in the following way:

	as a consumer: if the path is a file, it just reads the file; if it is a directory, it scans all the files under the path that satisfy the configured pattern. All the files under that directory must be of the same type.
	as a producer: if at least one split strategy is defined, the path is considered a directory, and under that directory the producer creates a different file per split, named seg0, seg1, seg2, etc.



Options



|| Name || Default Value || Description ||
| overwrite | true | The file can be overwritten. |
| append | false | Append to an existing file. Note that not all HDFS file systems support the append option. |
| bufferSize | 4096 | The buffer size used by HDFS. |
| replication | 3 | The HDFS replication factor. |
| blockSize | 67108864 | The size of the HDFS blocks. |
| fileType | NORMAL_FILE | It can be SEQUENCE_FILE, MAP_FILE, ARRAY_FILE, or BLOOMMAP_FILE; see Hadoop. |
| fileSystemType | HDFS | It can be LOCAL for the local filesystem. |
| keyType | NULL | The type for the key in case of sequence or map files. See below. |
| valueType | TEXT | The type for the value in case of sequence or map files. See below. |
| splitStrategy | | A string describing the strategy on how to split the file based on different criteria. See below. |
| openedSuffix | opened | When a file is opened for reading/writing, it is renamed with this suffix to avoid reading it during the writing phase. |
| readSuffix | read | Once the file has been read, it is renamed with this suffix to avoid reading it again. |
| initialDelay | 0 | For the consumer, how long to wait (in milliseconds) before starting to scan the directory. |
| delay | 0 | The interval (in milliseconds) between the directory scans. |
| pattern | * | The pattern used for scanning the directory. |
| chunkSize | 4096 | When reading a normal file, it is split into chunks, producing one message per chunk. |
| connectOnStartup | true | Camel 2.9.3/2.10.1: Whether to connect to the HDFS file system on starting the producer/consumer. If false then the connection is created on-demand. Notice that HDFS may take up to 15 minutes to establish a connection, as it has a hardcoded 45 x 20 sec redelivery. Setting this option to false allows your application to start up without blocking for up to 15 minutes. |





KeyType and ValueType

	NULL it means that the key or the value is absent
	BYTE for writing a byte, the java Byte class is mapped into a BYTE
	BYTES for writing a sequence of bytes. It maps the java ByteBuffer class
	INT for writing java integer
	FLOAT for writing java float
	LONG for writing java long
	DOUBLE for writing java double
	TEXT for writing java strings



BYTES is also used with everything else. For example, in Camel a file is sent around as an InputStream; in this case it is written to a sequence file or a map file as a sequence of bytes.

Splitting Strategy
In the current version of Hadoop opening a file in append mode is disabled since it's not very reliable. So, for the moment, it's only possible to create new files. The Camel HDFS endpoint tries to solve this problem in this way:

	If the split strategy option has been defined, the actual file name will become a directory name and a /seg0 will be initially created.
	Every time a splitting condition is met a new file is created with name /segN where N is 1, 2, 3, etc.
The splitStrategy opti

[CONF] Apache Camel > ElasticSearch

2013-08-07 Thread Ben O'Day (Confluence)







ElasticSearch
Page edited by Ben O'Day


Comment:
updated per CAMEL-6444


 Changes (1)
 




...
|indexName|the name of the index to act against  |indexType|the type of the index to act against 
|ip|the TransportClient remote host ip to use *Camel 2.12* |port|the TransportClient remote port to use (defaults to 9300) *Camel 2.12* 
 h3. Message Operations 
...


Full Content

ElasticSearch Component
Available as of Camel 2.11

The ElasticSearch component allows you to interface with an ElasticSearch server.

Maven users will need to add the following dependency to their pom.xml for this component:



<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-elasticsearch</artifactId>
    <version>x.x.x</version>
</dependency>





URI format



elasticsearch://[clusterName]?[options]



If you want to run against a local (in JVM/classloader) ElasticSearch server, just set the clusterName value in the URI to "local". See the client guide for more details.

Endpoint Options

The following options may be configured on the ElasticSearch endpoint.  All are required to be set as either an endpoint URI parameter or as a header (headers override endpoint properties)




|| name || description ||
| operation | required, indicates the operation to perform |
| indexName | the name of the index to act against |
| indexType | the type of the index to act against |
| ip | the TransportClient remote host ip to use (Camel 2.12) |
| port | the TransportClient remote port to use (defaults to 9300) (Camel 2.12) |
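The rule that headers override endpoint properties can be pictured as a two-step lookup. The sketch below is not the component's code: `OptionResolutionSketch` and `resolve` are hypothetical names, and it assumes that the header key equals the option name, as the text states for "operation".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the camel-elasticsearch source: a message header
// wins over the value configured on the endpoint URI.
class OptionResolutionSketch {

    static String resolve(String name, Map<String, Object> headers, Map<String, String> endpointOptions) {
        Object header = headers.get(name);
        if (header != null) {
            return header.toString(); // header overrides the endpoint value
        }
        String configured = endpointOptions.get(name);
        if (configured == null) {
            throw new IllegalArgumentException(
                "Option '" + name + "' is set neither as a header nor as a URI parameter");
        }
        return configured;
    }

    public static void main(String[] args) {
        Map<String, String> endpointOptions = new HashMap<String, String>();
        endpointOptions.put("operation", "INDEX");
        Map<String, Object> headers = new HashMap<String, Object>();
        headers.put("operation", "GET_BY_ID");
        System.out.println(resolve("operation", headers, endpointOptions));
    }
}
```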





Message Operations

The following ElasticSearch operations are currently supported.  Simply set an endpoint URI option or exchange header with a key of "operation" and a value set to one of the following.  Some operations also require other parameters or the message body to be set.




|| operation || message body || description ||
| INDEX | Map, String, byte[] or XContentBuilder content to index | adds content to an index and returns the content's indexId in the body |
| GET_BY_ID | index id of content to retrieve | retrieves the specified index and returns a GetResult object in the body |
| DELETE | index id of content to delete | deletes the specified indexId and returns a DeleteResult object in the body |





Index Example

Below is a simple INDEX example



from("direct:index")
.to("elasticsearch://local?operation=INDEX&indexName=twitter&indexType=tweet");












A client would simply need to pass a body message containing a Map to the route.  The result body contains the indexId created.



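// Build the content to index as a Map
Map<String, String> map = new HashMap<String, String>();
map.put("content", "test");
// template is a Camel ProducerTemplate; the reply body is the created indexId
String indexId = template.requestBody("direct:index", map, String.class);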



For more information, see these resources

ElasticSearch Main Site

ElasticSearch Java API

See Also

	Configuring Camel
	Component
	Endpoint
	Getting Started




