Hi Othman,

These exceptions are now coming from file locking and are due to permissions problems. I suggest you move to Zookeeper instead of file-based locking.

I am building a 2.8.1 release candidate. When it is available for download, I'll send you the URL.

Thanks,
Karl

On Fri, Sep 1, 2017 at 5:27 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:

> Hi Karl,
>
> This morning, I followed the steps you told me to do and I still got stack traces. I have attached the stack traces as well as the content of my lib repo and options.env.
> I have installed zookeeper and I'm ready to use the zookeeper example. Could you guide me through it? I don't know whether following the same steps as in the file-based example will get rid of the stack traces.
>
> Thanks,
> Othman
>
> On Thu, 31 Aug 2017 at 18:19, Karl Wright <daddy...@gmail.com> wrote:
>
>> Please do the following:
>>
>> (0) Shut down all ManifoldCF processes.
>> (1) Move poi*.jar from connector-common-lib to lib.
>> (2) Move dom4j*.jar from connector-common-lib to lib.
>> (3) Move commons-collections4*.jar from connector-common-lib to lib.
>> (4) Move xmlbeans*.jar from connector-common-lib to lib.
>> (5) Move curvesapi*.jar from connector-common-lib to lib.
>> (6) Modify your options.env to include all of the jars you moved.
>> (7) Start up all ManifoldCF processes.
>> (8) If you still get stack traces, please send them to me.
>>
>> Karl
>>
>> On Thu, Aug 31, 2017 at 12:12 PM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>
>>> Hi Karl,
>>>
>>> By 'other place', do you mean the \lib repository? If so, then I have already tried it and it didn't work.
>>>
>>> Othman.
>>>
>>> On Thu, 31 Aug 2017 at 18:07, Karl Wright <daddy...@gmail.com> wrote:
>>>
>>>> Hi Othman,
>>>>
>>>> I used the java dependency inspector to see what the issue is and it turns out that poi-ooxml.jar does refer back to poi.jar in the class that is failing. So you will need to move poi-3.15.jar and commons-collections4-4.1.jar to the other place as well.
>>>>
>>>> Let's hope that finally fixes this issue.
>>>>
>>>> I'm very unhappy about the quality of the POI project code; it is definitely not using reasonable engineering practices, and I will be opening a ticket with them.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>> On Thu, Aug 31, 2017 at 11:57 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>
>>>>> I'm using the file based example and all the changes you told me to do. I reproduced them in the file based example. I'll try to install zookeeper and use the zookeeper example. Will I need to do any configuration in order to run the zookeeper example?
>>>>>
>>>>> Othman.
>>>>>
>>>>> On Thu, 31 Aug 2017 at 17:46, Karl Wright <daddy...@gmail.com> wrote:
>>>>>
>>>>>> Are you using the zookeeper example, or the file-based example?
>>>>>>
>>>>>> If these jars have all been moved, and the options.env includes them, then I have to conclude that Apache POI's pom.xml is incorrect too. It will take a while to figure out what's missing that poi-ooxml.jar needs that is not listed.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>> On Thu, Aug 31, 2017 at 11:39 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>
>>>>>>> All the dependencies you mentioned have already been added in the options.env.win file in the multiprocess-file-example repository.
>>>>>>>
>>>>>>> On Thu, 31 Aug 2017 at 17:33, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, I added it in the options.env.win file. Should it be the one in the multiprocess-zk-example directory or multiprocess-file-example?
>>>>>>>>
>>>>>>>> On Thu, 31 Aug 2017 at 17:30, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> It's not related at all to elasticsearch.
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>> On Thu, Aug 31, 2017 at 11:26 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Could it be a problem of elasticsearch's version?
>>>>>>>>>> I'm actually using 2.1.0, which is pretty old for this new version of ManifoldCF.
>>>>>>>>>>
>>>>>>>>>> Othman.
>>>>>>>>>>
>>>>>>>>>> On Thu, 31 Aug 2017 at 17:23, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I moved back both the jars you mentioned and a different error is showing. You will find the stack trace attached.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Othman
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 31 Aug 2017 at 17:09, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I've looked at the dependencies; you should not have moved poi-3.15.jar. Please move that back, and commons-collections4-4.1.jar too.
>>>>>>>>>>>>
>>>>>>>>>>>> You *will* need to move curvesapi-1.04.jar though.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Karl
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Aug 31, 2017 at 11:04 AM, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> If you include poi.jar, then all dependencies of poi.jar must also be included. This would mean that curvesapi-1.04.jar and commons-collections4-4.1.jar should also be included.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 10:23 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Karl,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I added the two jars that you have mentioned and another one: poi-3.15.jar. Unfortunately, there is another error showing. This time, it concerns excel files. You will find attached the stack trace.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:32, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, this shows that the jar we moved calls back into another jar, which will also need to be moved. *That* jar has yet another dependency too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The list of jars is thus extended to include:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> poi-ooxml-3.15.jar
>>>>>>>>>>>>>>> dom4j-1.6.1.jar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:25 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You will find attached the stack trace. My apologies for the bad quality of the image, I'm doing my best to send you the stack trace as I don't have the right to send documents outside the company.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you for your time,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:16, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Once again, I need a stack trace to diagnose what the problem is.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 9:14 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Oh, actually it didn't solve the problem. I looked into the log file and saw the following error:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Error tossed : org/apache/poi/POIXMLTypeLoader
>>>>>>>>>>>>>>>>>> java.lang.NoClassDefFoundError: org/apache/poi/POIXMLTypeLoader.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Maybe another jar is missing?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 15:01, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have tried what you told me to do and, as you expected, the crawling resumed. How about the regular expressions? How can I make complex regular expressions in the job's paths tab?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you very much for your help.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:47, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Ok, I will try it right away and let you know if it works.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 14:15, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Oh, and you also may need to edit your options.env files to include them in the classpath for startup.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:53 AM, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you are amenable, there is another workaround you could try. Specifically:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> (1) Shut down all MCF processes.
>>>>>>>>>>>>>>>>>>>>>> (2) Move the following two files from connector-common-lib to lib:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> xmlbeans-2.6.0.jar
>>>>>>>>>>>>>>>>>>>>>> poi-ooxml-schemas-3.15.jar
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> (3) Restart everything and see if your crawl resumes.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please let me know what happens.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:33 AM, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I created a ticket for this: CONNECTORS-1450.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> One simple workaround is to use the external Tika server transformer rather than the embedded Tika Extractor. I'm still looking into why the jar is not being found.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 7:08 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Yes, I'm actually using the latest binary version, and my job got stuck on that specific file.
>>>>>>>>>>>>>>>>>>>>>>>> The job status is still Running. You can see it in the attached file. For your information, the job started yesterday.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 13:04, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> It looks like a dependency of Apache POI is missing.
>>>>>>>>>>>>>>>>>>>>>>>>> I think we will need a ticket to address this, if you are indeed using the binary distribution.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 6:57 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually using the binary version. For security reasons, I can't send any files from my computer. I have copied the stack trace and scanned it with my cellphone. I hope it will be helpful. Meanwhile, I have read the documentation about how to restrict the crawling and I don't think the '|' works in the specification. For instance, I would like to restrict the crawling to the documents that contain the word 'sound'. I proceed as follows: *(SON)*. The document is in capital letters and I noticed that it wasn't taken into consideration.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 31 Aug 2017 at 12:40, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> The way you restrict documents with the windows share connector is by specifying information on the "Paths" tab in jobs that crawl windows shares.
>>>>>>>>>>>>>>>>>>>>>>>>>>> There is end-user documentation, both online and distributed with all binary distributions, that describes how to do this. Have you found it?
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Aug 31, 2017 at 5:25 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you for your response, I will start using zookeeper and I will let you know if it works. I have another question to ask. Actually, I need to make some filters while crawling. I don't want to crawl some files and some folders. Could you give me an example of how to use the regex? Does the regex allow using /i to ignore case?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 19:53, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Beelz,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> File-based sync is deprecated because people often have problems with getting file permissions right, and they do not understand how to shut processes down cleanly, and zookeeper is resilient against that.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I highly recommend using zookeeper sync.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF is engineered to not put files into memory so you do not need huge amounts of memory. The default values are more than enough for 35,000 files, which is a pretty small job for ManifoldCF.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 11:58 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually not using zookeeper. I want to know how zookeeper is different from file-based sync. I also need guidance on how to manage my PC's memory. How many GB should I allocate for the start-agent of ManifoldCF? Is 4 GB enough to crawl 35K files?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 16:11, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your disk is not writable for some reason, and that's interfering with ManifoldCF 2.8 locking.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would suggest two things:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (1) Use Zookeeper for sync instead of file-based sync.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (2) Have a look if you still get failures after that.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 9:37 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Mr Karl,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thank you Mr Karl for your quick response. I have looked into the ManifoldCF log file and extracted the following warnings:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Attempt to set file lock 'D:\xxxx\apache_manifoldcf-2.8\multiprocess-file-example\.\.\syncharea\569\352\lock-_POOLTARGET_OUTPUTCONNECTORPOOL_ES (Lowercase) Synapses.lock' failed : Access is denied.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Couldn't write to lock file; disk may be full. Shutting down process; locks may be left dangling. You must cleanup before restarting.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ES (lowercase) synapses being the elasticsearch output connection. Moreover, the job uses Tika to extract metadata and a file system as a repository connection.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> During the job, I don't extract the content of the documents. I was wondering if the issue comes from elasticsearch?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, 30 Aug 2017 at 14:08, Karl Wright <daddy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Othman,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ManifoldCF aborts a job if there's an error that looks like it might go away on retry, but does not. It can be either on the repository side or on the output side. If you look at the Simple History in the UI, or at the manifoldcf.log file, you should be able to get a better sense of what went wrong. Without further information, I can't say any more.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Karl
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Aug 30, 2017 at 5:33 AM, Beelz Ryuzaki <i93oth...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm Othman Belhaj, a software engineer from Société Générale in France.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm actually using your recent version of ManifoldCF, 2.8. I'm working on an internal search engine. For this reason, I'm using ManifoldCF in order to index documents on windows shares. I encountered a serious problem while crawling 35K documents. Most of the time, when ManifoldCF starts crawling a large document (19 MB, for example), it ends the job with the following error: repeated service interruptions - failure processing document : software caused connection abort: socket write error.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Can you give me some tips on how to solve this problem, please?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I use PostgreSQL 9.3.x and elasticsearch 2.1.0.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm looking forward to your response.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Othman BELHAJ
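
For reference, the jar-moving steps (1) to (5) in Karl's 31 Aug message quoted above could be scripted roughly as follows. This is only a sketch in Python, not something shipped with ManifoldCF: the function name move_jars, the dry_run flag and the example path are illustrative assumptions, while the directory names connector-common-lib and lib and the jar name patterns come from the thread. It does not touch options.env, so step (6) still has to be done by hand, and all ManifoldCF processes should be shut down first, as in step (0).

```python
import shutil
from pathlib import Path

# Glob patterns taken from steps (1)-(5) of the quoted instructions
# (all of the files concerned are .jar files).
JAR_PATTERNS = [
    "poi*.jar",
    "dom4j*.jar",
    "commons-collections4*.jar",
    "xmlbeans*.jar",
    "curvesapi*.jar",
]


def move_jars(mcf_home, patterns=JAR_PATTERNS, dry_run=True):
    """Move matching jars from connector-common-lib to lib.

    mcf_home is the ManifoldCF installation directory (hypothetical
    argument). With dry_run=True nothing is moved; the function only
    returns the destination paths it would create.
    """
    src = Path(mcf_home) / "connector-common-lib"
    dst = Path(mcf_home) / "lib"
    moved = []
    for pattern in patterns:
        for jar in sorted(src.glob(pattern)):
            target = dst / jar.name
            if not dry_run:
                dst.mkdir(exist_ok=True)
                shutil.move(str(jar), str(target))
            moved.append(target)
    return moved


# Example use (the path is a placeholder, not a real location):
#   for jar in move_jars(r"C:\path\to\manifoldcf", dry_run=True):
#       print(jar.name)
```

Running it with dry_run=True first lists the jars that would be moved, which is also the list of file names to add to options.env afterwards.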