On Wed, Sep 19, 2012 at 3:27 PM, Bilgin Ibryam <bibr...@gmail.com> wrote:
> Hi all,
>
> I have to fetch two files from an ftp server at the same time and merge
> them into one message for further processing.
> The merging part is easy using an aggregator, but is there a way to
> ensure that both files are read from the ftp folder at the same moment?
>
> The reason is that another process writes to the ftp folder and replaces
> old files with new ones every couple of seconds. So it is possible that
> the second file is updated while the first file is being processed by
> the ftp consumer, leaving the two messages out of sync.
>
> Any suggestions on how to approach it?
>

IMHO this sounds like a somewhat odd and risky way of transferring data
between parties. There is no good way to tell whether the 2 files are in
sync, or whether the first file has new content while the 2nd still has
the previous content.

It would be better if the other party could change its strategy and
upload new files using a name pattern that pairs the files, e.g.
a matching sequence number:

myfile-123.dat
myotherfile-123.dat
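
With names like these, the consumer side can check which sequence numbers
have both files before processing anything. A minimal sketch in plain Java,
assuming the two prefixes from the example above and a ".dat" suffix (the
class and method names are made up for illustration, and this is not a
Camel API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SequencePairer {
    // Assumed naming scheme from the example: <prefix>-<sequence>.dat
    private static final Pattern NAME =
            Pattern.compile("(myfile|myotherfile)-(\\d{1,18})\\.dat");

    /** Returns the sequence numbers for which both files are listed, ascending. */
    static List<Long> completeSequences(List<String> fileNames) {
        // Bit 1 marks "myfile" seen, bit 2 marks "myotherfile" seen.
        Map<Long, Integer> seen = new TreeMap<>();
        for (String name : fileNames) {
            Matcher m = NAME.matcher(name);
            if (!m.matches()) {
                continue; // ignore anything that does not follow the pattern
            }
            int bit = m.group(1).equals("myfile") ? 1 : 2;
            seen.merge(Long.parseLong(m.group(2)), bit, (a, b) -> a | b);
        }
        List<Long> complete = new ArrayList<>();
        for (Map.Entry<Long, Integer> e : seen.entrySet()) {
            if (e.getValue() == 3) { // both bits set, so the pair is complete
                complete.add(e.getKey());
            }
        }
        return complete;
    }
}
```

Only the sequence numbers returned here would be downloaded and aggregated;
a half-written pair is simply picked up on a later poll.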

Or use a 3rd "done" file to tell you that the 2 files have been
properly written, e.g. by deleting the done file first, writing the
2 files, and then saving the done file again.
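
The done-file handshake can be sketched on a local directory like this. It
is only an illustration of the ordering, not FTP code: the file names and
the DoneFileProtocol class are assumptions, and the real names are whatever
the two parties agree on.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

final class DoneFileProtocol {
    // Hypothetical names, chosen only for this sketch.
    static final String FIRST = "myfile.dat";
    static final String SECOND = "myotherfile.dat";
    static final String DONE = "batch.done";

    /** Writer side: remove the marker, replace both files, then restore the marker. */
    static void publish(Path dir, String first, String second) throws IOException {
        Files.deleteIfExists(dir.resolve(DONE));
        Files.write(dir.resolve(FIRST), first.getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve(SECOND), second.getBytes(StandardCharsets.UTF_8));
        Files.createFile(dir.resolve(DONE));
    }

    /** Reader side: returns both contents, or empty if the marker is absent. */
    static Optional<String[]> readPair(Path dir) throws IOException {
        if (!Files.exists(dir.resolve(DONE))) {
            return Optional.empty(); // writer is mid-update or has not written yet
        }
        String a = new String(Files.readAllBytes(dir.resolve(FIRST)), StandardCharsets.UTF_8);
        String b = new String(Files.readAllBytes(dir.resolve(SECOND)), StandardCharsets.UTF_8);
        return Optional.of(new String[] {a, b});
    }
}
```

Note that the reader still has to finish both reads before the writer starts
its next cycle; the marker only tells you that the previous write completed.
Camel's file and ftp consumers also have a doneFileName option that may
cover the reader side.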


Can you move the files before you download them? A move operation
on a remote FTP server is faster than downloading the 2 files in
parallel and hoping that you finish before the other party starts to
overwrite them.

If you can move (or rename) the files, then that's likely a near-atomic
IO operation, so doing 2 file moves would be fast. You would still
need to send 2 FTP commands, though.
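
As a rough local-filesystem analogy of that idea (not FTP code, and the
ClaimByMove class and directory names are assumptions for illustration),
claiming both files by moving them into a staging directory before reading
could look like this:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ClaimByMove {
    /**
     * Moves both files from inbox into staging. Returns false, moving
     * nothing, if either file is missing when the check is made.
     */
    static boolean claim(Path inbox, Path staging, String first, String second)
            throws IOException {
        Path a = inbox.resolve(first);
        Path b = inbox.resolve(second);
        if (!Files.exists(a) || !Files.exists(b)) {
            return false;
        }
        Files.createDirectories(staging);
        // Each move is atomic on its own; the pair of moves is not.
        Files.move(a, staging.resolve(first), StandardCopyOption.ATOMIC_MOVE);
        Files.move(b, staging.resolve(second), StandardCopyOption.ATOMIC_MOVE);
        return true;
    }
}
```

The window in which the other party can interfere shrinks to the gap between
the two moves, instead of the time it takes to download both files.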

Camel has a preMove option, but it's still based on the
one-file-at-a-time strategy the ftp consumer uses.


If you know the file names in advance, then you can have 2 routes,
each picking up its designated file. And if you are allowed to
delete the file after download, then you can assume that any file
present has new content since the last poll.

But the 2 sec frequency is a bit fast, which means you would need to
download and delete the file in < 2s, and have the polling frequency
of the ftp consumer aligned with the other party.



> Thanks
> Bilgin



-- 
Claus Ibsen
-----------------
FuseSource
Email: cib...@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus, fusenews
Blog: http://davsclaus.com
Author of Camel in Action: http://www.manning.com/ibsen
