[jira] [Resolved] (CONNECTORS-1554) Job stuck during crawl documents on folder

2018-11-07 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1554.
-
Resolution: Cannot Reproduce

> Job stuck during crawl documents on folder
> --
>
> Key: CONNECTORS-1554
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1554
> Project: ManifoldCF
> Issue Type: Bug
> Components: Active Directory authority, File system connector, Tika extractor
> Affects Versions: ManifoldCF 2.11
> Environment: Ubuntu Server 18.04, ManifoldCF 2.11, Solr 7.5.0, Tika Server 1.19.1
> Reporter: Mario Bisonti
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
> Attachments: SimpleHistory.png, manifoldcf.log
>
>
> Hello.
> When I start a job that indexes a Windows share, it gets stuck after about 
> 15 minutes.
>  
> I see an error in manifoldcf.log, as you can see in the attachment.
>  
> I have also attached the "Simple History" showing the last documents crawled.
> Thanks a lot.
> Mario
> [^manifoldcf.log]!SimpleHistory.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1554) Job stuck during crawl documents on folder

2018-11-07 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678101#comment-16678101
 ] 

Karl Wright commented on CONNECTORS-1554:
-

[~bisontim], there are several approved models under which you can run 
ManifoldCF.  They are each represented by an example directory in the 
distribution.  But the way you propose running everything under Tomcat is not 
one of these.

If you indeed want to run ManifoldCF as a single process (with the pitfalls 
that may have, including issues regarding starvation of UI resources during 
heavy crawling), you can simply deploy the combined ManifoldCF war file.  
Instructions are on the "how to build and deploy" page.




[jira] [Commented] (CONNECTORS-1554) Job stuck during crawl documents on folder

2018-11-07 Thread Mario Bisonti (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677994#comment-16677994
 ] 

Mario Bisonti commented on CONNECTORS-1554:
---

Hello Karl.

Great news!

I migrated my MCF configuration to 
/opt/manifoldcf/multiprocess-zk-example-proprietary/
After configuring it, I manually started:

sudo -u tomcat /opt/manifoldcf/multiprocess-zk-example-proprietary/runzookeeper.sh

and

sudo -u tomcat /opt/manifoldcf/multiprocess-zk-example-proprietary/start-agents.sh

Initially I got a Java heap memory error, so I raised the ManifoldCF heap 
from 512m to 2048m in options.env.unix:

sudo nano /opt/manifoldcf/multiprocess-zk-example-proprietary/options.env.unix

-Xms2048m
-Xmx2048m
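
The heap change above can also be scripted rather than made in an editor. This is a minimal sketch, assuming options.env.unix holds one JVM option per line (as the -Xms/-Xmx lines above suggest). The set_heap function name is hypothetical and not part of ManifoldCF.

```shell
#!/bin/sh
# set_heap FILE SIZE
# Rewrites the -Xms and -Xmx lines of FILE to SIZE (for example 2048m).
# Assumption: FILE holds one JVM option per line; an option that is
# missing is appended. Hypothetical helper, not shipped with ManifoldCF.
set_heap() {
    file=$1
    size=$2
    for opt in Xms Xmx; do
        if grep -q "^-$opt" "$file"; then
            # Replace the existing value in place; sed keeps a .bak copy.
            sed -i.bak "s/^-$opt.*/-$opt$size/" "$file"
        else
            printf '%s\n' "-$opt$size" >> "$file"
        fi
    done
}
```

It could be called, for example, as set_heap /opt/manifoldcf/multiprocess-zk-example-proprietary/options.env.unix 2048m, with the agents process restarted afterwards so that the new values take effect.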

The scan has now been running for two hours and no longer seems to hang!

As you said, the file-system synchronization was probably causing the problem, 
and ZooKeeper seems (fingers crossed :-) ) to work very well!

I would like to understand how to make the agents process and ZooKeeper start 
together with Tomcat instead of starting them manually.

As I understand it, to start the agents process I need to add it to 
/etc/systemd/system/tomcat.service by appending:
-Dorg.apache.manifoldcf.agents.AgentRun    ?

So my /etc/systemd/system/tomcat.service would become:

[Unit]
Description=Apache Tomcat Web Application Container
After=network.target

[Service]
Type=forking

Environment=JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64
Environment=CATALINA_PID=/opt/tomcat/temp/tomcat.pid
Environment=CATALINA_HOME=/opt/tomcat
Environment=CATALINA_BASE=/opt/tomcat
Environment='CATALINA_OPTS=-Xms512M -Xmx1024M -server -XX:+UseParallelGC -Dorg.apache.manifoldcf.configfile=/opt/manifoldcf/multiprocess-zk-example-proprietary/properties.xml' -Dorg.apache.manifoldcf.agents.AgentRun
Environment='JAVA_OPTS=-Djava.awt.headless=true -Djava.security.egd=file:/dev/./urandom'

ExecStart=/opt/tomcat/bin/startup.sh
ExecStop=/opt/tomcat/bin/shutdown.sh

User=tomcat
Group=tomcat
UMask=0007
RestartSec=10
Restart=always

[Install]
WantedBy=multi-user.target

 
Is this right?
If so, how can I also start ZooKeeper automatically?


Thanks a lot for your great help, Karl!
Best regards.
Mario
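
On the two questions above, one option is to leave tomcat.service alone and give ZooKeeper and the agents process their own systemd units. A -D flag only defines a JVM system property, so it would not by itself launch a second process. The sketch below is an illustration under stated assumptions, not a documented ManifoldCF setup. The unit names and the write_units helper are hypothetical, and it assumes that runzookeeper.sh and start-agents.sh stay in the foreground and expect to run from their own directory, which should be checked against the scripts.

```shell
#!/bin/sh
# write_units OUT_DIR MCF_DIR
# Writes two systemd unit files into OUT_DIR for the scripts in MCF_DIR.
# Hypothetical helper: unit names and settings are illustrative assumptions.
write_units() {
    out=$1
    mcf=$2

    # ZooKeeper unit. Assumes runzookeeper.sh stays in the foreground
    # (Type=simple) and uses paths relative to its own directory.
    cat > "$out/manifoldcf-zookeeper.service" <<EOF
[Unit]
Description=ZooKeeper for ManifoldCF
After=network.target

[Service]
Type=simple
User=tomcat
Group=tomcat
WorkingDirectory=$mcf
ExecStart=$mcf/runzookeeper.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

    # Agents unit. Requires= and After= make systemd start ZooKeeper first.
    cat > "$out/manifoldcf-agents.service" <<EOF
[Unit]
Description=ManifoldCF agents process
Requires=manifoldcf-zookeeper.service
After=manifoldcf-zookeeper.service

[Service]
Type=simple
User=tomcat
Group=tomcat
WorkingDirectory=$mcf
ExecStart=$mcf/start-agents.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, run write_units /etc/systemd/system /opt/manifoldcf/multiprocess-zk-example-proprietary as root, then systemctl daemon-reload and systemctl enable --now manifoldcf-zookeeper.service manifoldcf-agents.service. After= only orders unit startup; it does not wait until ZooKeeper accepts connections, so the agents process may still need to tolerate a short delay.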
