Re: Job Multiple Outputs

2019-09-10 Thread Julien Massiera
of the same document. The framework is smart enough to not hand a document to a connector if it hasn't changed (according to how the connector computes the connector-specific output version string). Karl On Tue, Sep 10, 2019 at 11:00 AM Julien Massiera <mailto:julien.massi...@francelabs.

Job Multiple Outputs

2019-09-10 Thread Julien Massiera
Hi, I would like an explanation of how a job behaves when several outputs are configured. My main question is: for each output, how is document ingestion managed? More precisely, are the ingestion processes synchronized or not? (In other words, is the ingestion of the next

Web connector empty session cookie cache

2019-06-03 Thread Julien Massiera
Hi all, I was doing some tests with the Web connector, and after several tries with different job configurations to crawl a session-based website, I noticed that one configuration was not working. So I debugged the job and noticed that the connector was using the wrong session cookie. In

Re: Solr examples with long metadata needed

2018-09-26 Thread Julien Massiera
s still around and the metadata can be shared?) Thanks in advance, Karl -- Julien MASSIERA Product Development Director France Labs – The Search experts Meet us at the Enterprise Search & Discovery Summit in Washington DC www.francelabs.com

Re: User rights for Sharepoint connector

2017-12-23 Thread Julien Massiera
Hi Karl, No problem, it is what I would have proposed anyway! Julien On 23/12/2017 at 16:27, Karl Wright wrote: Do you mind if I include this in the SharePoint connector documentation? Thanks, Karl On Sat, Dec 23, 2017 at 10:13 AM, Julien Massiera <julien.massi...@francelabs.

User rights for Sharepoint connector

2017-12-21 Thread Julien Massiera
rks but I would like to avoid using an admin user to crawl my site. Thanks for your help! Julien Massiera

MCF 2.8.1 agent logs

2017-09-27 Thread Julien Massiera
Hi MCF community, I recently switched from MCF 2.5 to MCF 2.8.1 and I am a little confused by the agent logs. First of all, I noticed that MCF now uses log4j2 and that the documentation is not up to date on this point (the old logging.ini format is mentioned):

Re: Download MCF 2.8.1

2017-09-12 Thread Julien Massiera
On Tue, Sep 12, 2017 at 5:19 AM, Julien Massiera <julien.massi...@francelabs.com> wrote: Hi everybody, I usually download MCF through the http://archive.apache.org website and I noticed that the 2.8.1 version is dated

Download MCF 2.8.1

2017-09-12 Thread Julien Massiera
and I can safely download MCF from this website or is it better to take it from the MCF website? Regards -- Julien MASSIERA Search Technology Expert France Labs – The Search experts Winner of EY's Internal Search challenge at Viva Technologies 2016 www.francelabs.com Tel: +33

Re: Delete IDs with JDBC connector

2017-04-27 Thread julien . massiera
right wrote:
> Hi Julien,
>
> How are you starting the job? If you use "Start minimal", deletion would not
> take place. If your job is a continuous one, this is also the case.
>
> Thanks,
> Karl
>
> On Wed, Apr 26, 2017 at 9:52 AM, <julien.massi...@francelabs.com> wrote:
> Hi the MCF community,
>
> I am using MCF 2.6 with the JDBC connector to crawl an Oracle Database and
> index the data into a Solr server, and it works very well. However, when I
> perform a delta re-crawl, the new IDs are correctly retrieved from the
> Database but those who have been deleted are not "detected" by the connector
> and thus, are still present in my Solr index.
> I would like to know if normally it should work and that I maybe have missed
> something in the configuration of the job, or if this is not implemented ?
> The only way I found to solve this issue is to reset the seeding of the job,
> but it is very time and resource consuming.
>
> Best regards,
> Julien Massiera

Re: Delete IDs with JDBC connector

2017-04-27 Thread julien . massiera
d not
> take place. If your job is a continuous one, this is also the case.
>
> Thanks,
> Karl
>
> On Wed, Apr 26, 2017 at 9:52 AM, <julien.massi...@francelabs.com> wrote:
> Hi the MCF community,
>
> I am using MCF 2.6 with the JDBC connector to crawl an Oracle Database and
> index the data into a Solr server, and it works very well. However, when I
> perform a delta re-crawl, the new IDs are correctly retrieved from the
> Database but those who have been deleted are not "detected" by the connector
> and thus, are still present in my Solr index.
> I would like to know if normally it should work and that I maybe have missed
> something in the configuration of the job, or if this is not implemented ?
> The only way I found to solve this issue is to reset the seeding of the job,
> but it is very time and resource consuming.
>
> Best regards,
> Julien Massiera

Re: Delete IDs with JDBC connector

2017-04-26 Thread julien . massiera
> Hi Julien,
>
> How are you starting the job? If you use "Start minimal", deletion would not
> take place. If your job is a continuous one, this is also the case.
>
> Thanks,
> Karl
>
> On Wed, Apr 26, 2017 at 9:52 AM, <julien.massi...@francelabs.com> wrote:
> Hi the MCF community,
>
> I am using MCF 2.6 with the JDBC connector to crawl an Oracle Database and
> index the data into a Solr server, and it works very well. However, when I
> perform a delta re-crawl, the new IDs are correctly retrieved from the
> Database but those who have been deleted are not "detected" by the connector
> and thus, are still present in my Solr index.
> I would like to know if normally it should work and that I maybe have missed
> something in the configuration of the job, or if this is not implemented ?
> The only way I found to solve this issue is to reset the seeding of the job,
> but it is very time and resource consuming.
>
> Best regards,
> Julien Massiera

Delete IDs with JDBC connector

2017-04-26 Thread julien . massiera
he seeding of the job, but it is very time and resource consuming. Best regards, Julien Massiera

[JCIFS Connector] crawl job stop on access error

2016-12-09 Thread Julien Massiera
eally want to prevent the job from stopping if a lock file is encountered and not filtered. Thanks -- Julien MASSIERA Search Technology Expert France Labs – The Search experts Winner of EY's Internal Search challenge at Viva Technologies 2016 www.francelabs.com Tel: +33 (0) 663778847

[MCF API] DELETE not available for some connectors

2016-11-28 Thread Julien Massiera
nnections" while the "GET" or "PUT" are available (see the method "executeDeleteCommand" in the class "org/apache/manifoldcf/crawler/system/ManifoldCF"). Is there a specific reason for this? If not, do you plan to implement it any time soon? Than

Re: Multiple output documents from one input document in transformation connector

2016-05-20 Thread Julien Massiera
re. Am I missing something? Julien On 19/05/2016 21:14, Karl Wright wrote: This sounds like it would work. Karl Sent from my Windows Phone From: Julien Massiera Sent: 5/19/2016 12:44 PM To: user@manifoldcf.apache.org Subject: Multiple output documents from one input document in transformation co

Multiple output documents from one input document in transformation connector

2016-05-19 Thread Julien Massiera
the emails and send them for Solr ingestion through the activities object. Is my approach correct, or do I need to consider another solution? Thanks for your help. Julien Massiera