+1...

Cheers,
Chris

On Nov 10, 2012, at 9:07 AM, Brian Foster wrote:

> Hey Rishi,
> 
> The filemgr connection from the pushpull is just to verify whether the 
> filemgr already has a file, so the pushpull doesn't redownload files (no 
> ingest support)... usually you configure your pushpull daemon to run at 
> longer intervals, but the crawler will usually wake up more often (every 30 
> seconds is a typical interval for it)... so just have the pushpull 
> download its files to a staging area that is the same directory the 
> crawler is monitoring.
> 
> -brian
> 
> On Nov 09, 2012, at 11:06 AM, "Verma, Rishi (388J)" 
> <[email protected]> wrote:
> 
>> Hey Brian, Shreyl,
>> 
>> Thanks for your input and clarification on this.
>> 
>> Brian - the delegation of duties you described makes sense. Does 
>> cas-pushpull have any way to invoke a local crawl process once downloads 
>> complete? I know it has a filemgr hookup, but I wonder whether a crawl 
>> process can be invoked after all file downloads via pushpull have 
>> finished. The alternative, of course, would be to schedule the crawler 
>> daemon to run well after the pushpull daemon finishes its work.
>> 
>> Thanks to both of you for your help!
>> rishi
>> 
>> On Nov 9, 2012, at 10:08 AM, Brian Foster wrote:
>> 
>>> 
>>> Hey Rishi,
>>> 
>>> You will need to use both cas-pushpull and cas-crawler to accomplish this...
>>> 
>>> cas-pushpull: Used for downloading files from remote sites to your local 
>>> systems... the .tmp files contain cas-pushpull's known metadata and you can 
>>> configure which of the known metadata gets written out or if a .tmp file 
>>> gets created at all... however you can add custom metadata fields to it.
>>> 
>>> cas-crawler: Allows for metadata extraction (custom metadata) from files on 
>>> your local system... and then allows you to ingest them into the filemgr 
>>> (ingestion can optionally be turned off)
>>> 
>>> HTH
>>> -brian
>>> 
>>> On Nov 08, 2012, at 06:11 PM, "Verma, Rishi (388J)" 
>>> <[email protected]> wrote:
>>> 
>>>> Hi All -
>>>> 
>>>> I'm wondering if anyone has experience with, or knows the details of how 
>>>> to use custom MetExtractors on products that are remotely downloaded via 
>>>> PushPull. 
>>>> 
>>>> By default, PushPull performs some basic met-extraction and creates a 
>>>> ".tmp" file associated with downloaded products, but I'm wondering whether 
>>>> this met generation step is customizable.
>>>> 
>>>> I've looked through the configuration files (e.g. [1], [2]) as well as the 
>>>> code for PushPull, but I can't seem to locate configuration parameters to 
>>>> support the invocation of custom met extractors on downloaded data.
>>>> 
>>>> If any of you have experience with this, or can point me to where to 
>>>> look, I'd really appreciate it.
>>>> 
>>>> Thanks! 
>>>> Rishi 
>>>> 
>>>> --
>>>> [1] http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/push_pull_framework.properties
>>>> [2] http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/examples/
>> 
