Hi Othman,

These exceptions are now coming from file locking and are due to
permissions problems.  I suggest you go to Zookeeper for file locking.
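
If you want to confirm the permissions problem first, here is a minimal
sketch of a check (not part of ManifoldCF; the function name and the example
path are my own assumptions). It only tests whether the current account can
create, write, and delete a file in a given directory, which is roughly what
file-based locking needs:

```python
import os
import tempfile


def can_write_lock(directory):
    """Return True if a file can be created, written, and removed in directory."""
    try:
        # mkstemp creates a uniquely named file, so nothing existing is touched.
        fd, path = tempfile.mkstemp(suffix=".lock", dir=directory)
        try:
            os.write(fd, b"lock")
        finally:
            os.close(fd)
        os.remove(path)
        return True
    except OSError:
        # Covers "access denied", a missing directory, a full disk, and so on.
        return False
```

For example, can_write_lock(r"D:\path\to\syncharea") with your real synch
directory substituted for that hypothetical path. Run it under the same
account that runs the ManifoldCF processes, otherwise the result may not
tell you much.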

I am building a 2.8.1 release candidate.  When it is available for download,
I'll send you the URL.

Thanks,
Karl


On Fri, Sep 1, 2017 at 5:27 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:

> Hi Karl,
>
> This morning, I followed the steps you told me to do and I still got
> stack traces. I have attached the stack traces as well as the contents of my
> lib repo and options.env.
> I have installed zookeeper and I'm ready to use the zookeeper example.
> Could you guide me through it? I don't know whether, if I follow the same
> steps as in the file-based example, I will still get stack traces.
>
> Thanks,
> Othman
>
> On Thu, 31 Aug 2017 at 18:19, Karl Wright <daddy...@gmail.com> wrote:
>
>> Please do the following:
>>
>> (0) Shut down all ManifoldCF processes.
>> (1) Move poi*.jar from connector-common-lib to lib.
>> (2) Move dom4j*.jar from connector-common-lib to lib.
>> (3) Move commons-collections4*.jar from connector-common-lib to lib.
>> (4) Move xmlbeans*.jar from connector-common-lib to lib.
>> (5) Move curvesapi*.jar from connector-common-lib to lib.
>> (6) Modify your options.env to include all of the jars you moved.
>> (7) Start up all ManifoldCF processes.
>> (8) If you still get stack traces, please send them to me.
>>
>> Karl
>>
>>
>> On Thu, Aug 31, 2017 at 12:12 PM, Beelz Ryuzaki <i93oth...@gmail.com>
>> wrote:
>>
>>> Hi Karl,
>>>
>>> By 'other place', do you mean the \lib repository? If so, then I
>>> have already tried it and it didn't work.
>>>
>>> Othman.
>>>
>>> On Thu, 31 Aug 2017 at 18:07, Karl Wright <daddy...@gmail.com> wrote:
>>>
>>>> Hi Othman,
>>>>
>>>> I used the java dependency inspector to see what the issue is and it
>>>> turns out that poi-ooxml.jar does refer back to poi.jar in the class that
>>>> is failing.  So you will need to move poi-3.15.jar and
>>>> commons-collections4-4.1.jar to the other place as well.
>>>>
>>>> Let's hope that finally fixes this issue.
>>>>
>>>> I'm very unhappy about the quality of the POI project code; it is
>>>> definitely not using reasonable engineering practices, and I will be
>>>> opening a ticket with them.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>>
>>>> On Thu, Aug 31, 2017 at 11:57 AM, Beelz Ryuzaki <i93oth...@gmail.com>
>>>> wrote:
>>>>
>>>>> I'm using the file-based example, and I reproduced all the changes you
>>>>> told me to make there. I'll try to install zookeeper and use the
>>>>> zookeeper example. Will I need to do any configuration in order to run
>>>>> the zookeeper example?
>>>>>
>>>>> Othman.
>>>>>
>>>>> On Thu, 31 Aug 2017 at 17:46, Karl Wright <daddy...@gmail.com> wrote:
>>>>>
>>>>>> Are you using the zookeeper example, or the file-based example?
>>>>>>
>>>>>> If these jars have all been moved, and the options.env includes them,
>>>>>> then I have to conclude that Apache POI's pom.xml is incorrect too.  It
>>>>>> will take a while to figure out what's missing that poi-ooxml.jar needs
>>>>>> that is not listed.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 11:39 AM, Beelz Ryuzaki <i93oth...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> All the dependencies you mentioned have already been added in the
>>>>>>> options.env.win file in the multiprocess-file-example repository.
>>>>>>>
>>>>>>> On Thu, 31 Aug 2017 at 17:33, Beelz Ryuzaki <i93oth...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yes, I added it in the options.env.win file. Should it be the one
>>>>>>>> in the multiprocess-zk-example directory or multiprocess-file-example?
>>>>>>>>
>>>>>>>> On Thu, 31 Aug 2017 at 17:30, Karl Wright <daddy...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> It's not related at all to elasticsearch.
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 11:26 AM, Beelz Ryuzaki <
>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Could it be a problem with the elasticsearch version? I'm actually
>>>>>>>>>> using 2.1.0, which is pretty old for this new version of ManifoldCF.
>>>>>>>>>>
>>>>>>>>>> Othman.
>>>>>>>>>>
>>>>>>>>>> On Thu, 31 Aug 2017 at 17:23, Beelz Ryuzaki <i93oth...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I moved back both the jars you mentioned and a different error is
>>>>>>>>>>> showing. You will find the stack trace attached.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Othman
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 31 Aug 2017 at 17:09, Karl Wright <daddy...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I've looked at the dependencies; you should not have moved
>>>>>>>>>>>> poi-3.15.jar.  Please move that back, and 
>>>>>>>>>>>> commons-collections4-4.1.jar too.
>>>>>>>>>>>>
>>>>>>>>>>>> You *will* need to move curvesapi-1.04.jar though.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Karl
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 31, 2017 at 11:04 AM, Karl Wright <
>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> If you include poi.jar, then all dependencies of poi.jar must
>>>>>>>>>>>>> also be included.  This would mean that curvesapi-1.04.jar and
>>>>>>>>>>>>> commons-collections4-4.1.jar should also be included.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 10:23 AM, Beelz Ryuzaki <
>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I added the two jars that you mentioned and another one:
>>>>>>>>>>>>>> poi-3.15.jar. Unfortunately, there is another error showing.
>>>>>>>>>>>>>> This time, it concerns Excel files. You will find the stack
>>>>>>>>>>>>>> trace attached.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:32, Karl Wright <daddy...@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, this shows that the jar we moved calls back into
>>>>>>>>>>>>>>> another jar, which will also need to be moved.  *That* jar has
>>>>>>>>>>>>>>> yet another dependency too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The list of jars is thus extended to include:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> poi-ooxml-3.15.jar
>>>>>>>>>>>>>>> dom4j-1.6.1.jar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:25 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You will find the stack trace attached. My apologies for
>>>>>>>>>>>>>>>> the bad quality of the image; I'm doing my best to send you
>>>>>>>>>>>>>>>> the stack trace, as I don't have the right to send documents
>>>>>>>>>>>>>>>> outside the company.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for your time,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:16, Karl Wright <
>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Once again, I need a stack trace to diagnose what the
>>>>>>>>>>>>>>>>> problem is.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:14 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Oh, actually it didn't solve the problem. I looked into
>>>>>>>>>>>>>>>>>> the log file and saw the following error:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Error tossed : org/apache/poi/POIXMLTypeLoader
>>>>>>>>>>>>>>>>>> java.lang.NoClassDefFoundError: org/apache/poi/POIXMLTypeLoader.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Maybe another jar is missing ?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:01, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have tried what you told me to do and, as you expected,
>>>>>>>>>>>>>>>>>>> the crawling resumed. How about the regular expressions?
>>>>>>>>>>>>>>>>>>> How can I write complex regular expressions in the job's
>>>>>>>>>>>>>>>>>>> Paths tab?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you very much for your help.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:47, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Ok, I will try it right away and let you know if it
>>>>>>>>>>>>>>>>>>>> works.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:15, Karl Wright <
>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Oh, and you also may need to edit your options.env
>>>>>>>>>>>>>>>>>>>>> files to include them in the classpath for startup.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:53 AM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you are amenable, there is another workaround you
>>>>>>>>>>>>>>>>>>>>>> could try.  Specifically:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> (1) Shut down all MCF processes.
>>>>>>>>>>>>>>>>>>>>>> (2) Move the following two files from
>>>>>>>>>>>>>>>>>>>>>> connector-common-lib to lib:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> xmlbeans-2.6.0.jar
>>>>>>>>>>>>>>>>>>>>>> poi-ooxml-schemas-3.15.jar
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> (3) Restart everything and see if your crawl resumes.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please let me know what happens.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:33 AM, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I created a ticket for this: CONNECTORS-1450.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> One simple workaround is to use the external Tika
>>>>>>>>>>>>>>>>>>>>>>> server transformer rather than the embedded Tika
>>>>>>>>>>>>>>>>>>>>>>> Extractor.  I'm still looking into why the jar is not
>>>>>>>>>>>>>>>>>>>>>>> being found.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:08 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Yes, I'm actually using the latest binary version,
>>>>>>>>>>>>>>>>>>>>>>>> and my job got stuck on that specific file.
>>>>>>>>>>>>>>>>>>>>>>>> The job status is still Running. You can see it in
>>>>>>>>>>>>>>>>>>>>>>>> the attached file. For your information, the job
>>>>>>>>>>>>>>>>>>>>>>>> started yesterday.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 13:04, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> It looks like a dependency of Apache POI is
>>>>>>>>>>>>>>>>>>>>>>>>> missing.
>>>>>>>>>>>>>>>>>>>>>>>>> I think we will need a ticket to address this, if
>>>>>>>>>>>>>>>>>>>>>>>>> you are indeed using the binary distribution.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 6:57 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually using the binary version. For
>>>>>>>>>>>>>>>>>>>>>>>>>> security reasons, I can't send any files from my
>>>>>>>>>>>>>>>>>>>>>>>>>> computer. I have copied the stack trace and scanned
>>>>>>>>>>>>>>>>>>>>>>>>>> it with my cellphone. I hope it will be helpful.
>>>>>>>>>>>>>>>>>>>>>>>>>> Meanwhile, I have read the documentation about how
>>>>>>>>>>>>>>>>>>>>>>>>>> to restrict the crawling and I don't think the '|'
>>>>>>>>>>>>>>>>>>>>>>>>>> works in the specified. For instance, I would like
>>>>>>>>>>>>>>>>>>>>>>>>>> to restrict the crawling to the documents that
>>>>>>>>>>>>>>>>>>>>>>>>>> contain the word 'sound'. I proceeded as follows:
>>>>>>>>>>>>>>>>>>>>>>>>>> *(SON)* . The document is in capital letters and I
>>>>>>>>>>>>>>>>>>>>>>>>>> noticed that it didn't take it into consideration.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 12:40, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> The way you restrict documents with the windows
>>>>>>>>>>>>>>>>>>>>>>>>>>> share connector is by specifying information on the
>>>>>>>>>>>>>>>>>>>>>>>>>>> "Paths" tab in jobs that crawl windows shares.  There
>>>>>>>>>>>>>>>>>>>>>>>>>>> is end-user documentation, both online and
>>>>>>>>>>>>>>>>>>>>>>>>>>> distributed with all binary distributions, that
>>>>>>>>>>>>>>>>>>>>>>>>>>> describes how to do this.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Have you found it?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 5:25 AM, Beelz Ryuzaki <
>>>>>>>>>>>>>>>>>>>>>>>>>>> i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for your response. I will start using
>>>>>>>>>>>>>>>>>>>>>>>>>>>> zookeeper and I will let you know if it works. I
>>>>>>>>>>>>>>>>>>>>>>>>>>>> have another question to ask. Actually, I need to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> apply some filters while crawling. I don't want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> crawl some files and some folders. Could you give
>>>>>>>>>>>>>>>>>>>>>>>>>>>> me an example of how to use the regex? Does the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> regex allow using /i to ignore case?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 19:53, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Beelz,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> File-based sync is deprecated because people
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> often have problems with getting file permissions
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> right, and they do not understand how to shut
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> processes down cleanly, and zookeeper is resilient
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> against that.  I highly recommend using zookeeper
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sync.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF is engineered to not put files into
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> memory so you do not need huge amounts of memory.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The default values are more than enough for 35,000
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> files, which is a pretty small job for ManifoldCF.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 11:58 AM, Beelz
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually not using zookeeper. I want to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> know how zookeeper is different from file-based
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sync. I also need guidance on how to manage my
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> PC's memory. How many GB should I allocate for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the start-agent of ManifoldCF? Is 4 GB enough to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> crawl 35K files?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 16:11, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your disk is not writable for some reason,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and that's interfering with ManifoldCF 2.8 locking.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would suggest two things:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) Use Zookeeper for sync instead of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> file-based sync.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Have a look if you still get failures
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> after that.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 9:37 AM, Beelz
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mr Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you Mr Karl for your quick response.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have looked into the ManifoldCF log file and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> extracted the following warnings:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Attempt to set file lock
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'D:\xxxx\apache_manifoldcf-2.8\multiprocess-file-example\.\.\syncharea\569\352\lock-_POOLTARGET_OUTPUTCONNECTORPOOL_ES
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (Lowercase) Synapses.lock' failed : Access is denied.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Couldn't write to lock file; disk may be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> full. Shutting down process; locks may be left
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dangling. You must cleanup before restarting.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ES (lowercase) synapses being the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> elasticsearch output connection. Moreover, the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> job uses Tika to extract metadata and a file
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> system as a repository connection. During the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> job, I don't extract the content of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> documents. I was wondering if the issue comes
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from elasticsearch?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 14:08, Karl Wright <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF aborts a job if there's an
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> error that looks like it might go away on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> retry, but does not.  It can be either on the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> repository side or on the output side.  If you
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> look at the Simple History in the UI, or at the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> manifoldcf.log file, you should be able to get
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a better sense of what went wrong.  Without
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> further information, I can't say any more.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 5:33 AM, Beelz
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm Othman Belhaj, a software engineer
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from Société Générale in France. I'm
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> currently using your recent version of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF, 2.8. I'm working on an internal
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> search engine. For this reason, I'm using
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF in order to index documents on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> windows shares. I encountered a serious
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> problem while crawling 35K documents. Most of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the time, when ManifoldCF starts crawling a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> large document (19 MB, for example), it ends
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the job with the following error: repeated
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> service interruptions - failure processing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> document : software caused connection abort:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> socket write error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can you give me some tips on how to solve
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this problem, please?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I use PostgreSQL 9.3.x and elasticsearch
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2.1.0 .
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm looking forward to your response.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman BELHAJ
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>
>>
