I'm attempting to consume messages from an FTP server using an idempotent
repository to ensure that I do not re-download a file unless it has been
modified.

Here is my (quite simple) Camel configuration:
        <beans:bean id="downloadRepo"
                    class="org.apache.camel.processor.idempotent.FileIdempotentRepository">
                <beans:property name="fileStore" value="/tmp/.repo.txt"/>
                <beans:property name="cacheSize" value="25000"/>
                <beans:property name="maxFileStoreSize" value="1000000"/>
        </beans:bean>

        <camelContext trace="true" xmlns="http://camel.apache.org/schema/spring">
                <endpoint id="myFtpEndpoint"
                          uri="ftp://me@localhost?password=****&amp;binary=true&amp;recursive=true&amp;consumer.delay=15000&amp;readLock=changed&amp;passiveMode=true&amp;noop=true&amp;idempotentRepository=#downloadRepo&amp;idempotentKey=$simple{file:name}-$simple{file:modified}"/>
                <endpoint id="myFileEndpoint" uri="file:///tmp/files"/>

                <route>
                    <from uri="ref:myFtpEndpoint"/>
                    <to uri="ref:myFileEndpoint"/>
                </route>
        </camelContext>

When I start my application for the first time, all files are correctly
downloaded from the FTP server and stored in the target directory, as well
as recorded in the idempotent repo.

When I restart my application, all files are correctly detected as being in
the idempotent repo already on the first poll of the FTP server, and are not
re-downloaded:
2013-11-04 16:52:10,811 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] org.apache.camel.component.file.remote.FtpConsumer: FtpFile[name=test1.txt, dir=false, file=true]
2013-11-04 16:52:10,811 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] org.apache.camel.component.file.remote.FtpConsumer: This consumer is idempotent and the file has been consumed before. Will skip this file: RemoteFile[test1.txt]

However, on all subsequent polls to the FTP server the idempotent check is
short-circuited because the file is "in-progress":
2013-11-04 16:53:10,886 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] org.apache.camel.component.file.remote.FtpConsumer: FtpFile[name=test1.txt, dir=false, file=true]
2013-11-04 16:53:10,886 TRACE [Camel (camel-1) thread #0 - ftp://me@localhost] org.apache.camel.component.file.remote.FtpConsumer: Skipping as file is already in progress: test1.txt

I am using camel-ftp:2.11.1 (and I see the same behavior with 2.12.1). When
I inspect the source code I notice two interesting things.

First, the GenericFileConsumer check that determines whether a file is
already in progress, isInProgress(), is called from isValidFile() and always
adds the file to the inProgressRepository as a side effect:
    protected boolean isInProgress(GenericFile<T> file) {
        String key = file.getAbsoluteFilePath();
        return !endpoint.getInProgressRepository().add(key);
    }

Second, if a file is determined to match an entry already present in the
idempotent repository it is discarded (GenericFileConsumer.isValidFile()
returns false).  This means it is never published to an exchange, and thus
never reaches the code which would remove it from the inProgressRepository.

Since the in-progress check happens before the idempotent check, every poll
after the first one short-circuits on the in-progress state, and the file is
never checked against the idempotent repository again, even if it has since
been modified.
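
To make the ordering concrete, here is a minimal model of the behavior as I
understand it. It is plain Java with two HashSets standing in for the two
repositories; the class InProgressModel, its poll() method and the keys used
in main() are made up for illustration and are not Camel code. Only
isInProgress() mirrors the snippet quoted above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the consumer logic described above; not a Camel class.
class InProgressModel {
    private final Set<String> inProgress = new HashSet<>();
    private final Set<String> idempotent = new HashSet<>();

    InProgressModel(Set<String> alreadyConsumed) {
        // Simulates an idempotent repository loaded from disk on restart.
        idempotent.addAll(alreadyConsumed);
    }

    // Same shape as the quoted isInProgress(): the check itself adds the key.
    boolean isInProgress(String absolutePath) {
        return !inProgress.add(absolutePath);
    }

    // One poll of one file, with the checks in the order described above.
    String poll(String absolutePath, String idempotentKey) {
        if (isInProgress(absolutePath)) {
            return "skip: in progress";
        }
        if (idempotent.contains(idempotentKey)) {
            // Rejected here, so nothing removes absolutePath from inProgress.
            return "skip: consumed before";
        }
        idempotent.add(idempotentKey);
        inProgress.remove(absolutePath); // only a processed file is released
        return "consume";
    }

    public static void main(String[] args) {
        InProgressModel model =
                new InProgressModel(new HashSet<>(Arrays.asList("test1.txt-100")));
        // First poll after restart: the idempotent check rejects the file.
        System.out.println(model.poll("test1.txt", "test1.txt-100"));
        // Second poll: the in-progress check wins before the idempotent check runs.
        System.out.println(model.poll("test1.txt", "test1.txt-100"));
        // Even with a new modification stamp in the key, the file stays stuck.
        System.out.println(model.poll("test1.txt", "test1.txt-200"));
    }
}
```

In this model the first poll reports "skip: consumed before" and every later
poll reports "skip: in progress", which matches the two log excerpts above.
The third call shows the consequence I care about: a modified file would
never be picked up until the application is restarted.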

Am I reading this code correctly?  Am I missing something here?  This seems
like a bug in the implementation of the isInProgress(GenericFile<T> file)
method to me.


