How to delete unreachable documents on continuous crawling?

2014-08-12 Thread Bisonti Mario
Hello. I set up continuous crawling on a folder of a website to index the PDF files it contains. Schedule type: Rescan documents dynamically. Recrawl interval (if continuous): 5. I see that if documents are added to the folder, they are indexed, but if documents are deleted they aren’t deleted from

R: How to delete unreachable documents on continuous crawling?

2014-08-12 Thread Bisonti Mario
delete unreachable documents on continuous crawling? Hi Mario, Please read ManifoldCF in Action Chapter 1. Continuous crawling has no mechanism for deleting unreachable documents, and never will, because it is fundamentally impossible to do. Thanks, Karl On Tue, Aug 12, 2014 at 6:10 AM, Bisonti

R: How to delete unreachable documents on continuous crawling?

2014-08-12 Thread Bisonti Mario
, Yes, periodic recrawling allows ManifoldCF the opportunity to discover abandoned documents and remove them. Karl On Tue, Aug 12, 2014 at 6:18 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Ok, thanks. So you suggest that I not use continuous crawling

R: How to delete unreachable documents on continuous crawling?

2014-08-12 Thread Bisonti Mario
at the beginning This should give you what you want. Karl On Tue, Aug 12, 2014 at 8:43 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: So, I suppose, the best solution could be: continuous recrawling and one periodic recrawling to delete orphaned documents. Can I

R: How to delete unreachable documents on continuous crawling?

2014-08-27 Thread Bisonti Mario
code that made byte-rate throttling 1000x too restrictive. This was fixed in 1.6. Karl On Wed, Aug 27, 2014 at 5:38 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Hello. I increased RAM to 4GB and I manually executed the job to crawl “Web repository

R: How to delete unreachable documents on continuous crawling?

2014-08-27 Thread Bisonti Mario
27, 2014 at 6:36 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Thanks a lot. I understood about full crawls vs minimal crawls. Third, throttling: I set, for the web repository connection, throttling = 100. I set, for the Solr output connection, Throttling, max

Populate field Solr

2014-08-28 Thread Bisonti Mario
Hello. I have a web repository containing PDF files. From ManifoldCF I scan that directory and index it through the Solr output connector. I need to populate the “category” field of the Solr index. I tried to use a job in ManifoldCF to do this. Tab: Forced Metadata Parameter name: category Parameter value:

R: Populate field Solr

2014-08-28 Thread Bisonti Mario
Solr Hi Mario, Can you post the Solr log INFO message for the indexing of the document in question? Thanks, Karl On Thu, Aug 28, 2014 at 11:18 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Hello. I have a web repository containing PDF files. So from Manifold I

R: Populate field Solr

2014-08-29 Thread Bisonti Mario
you cut/paste the data on the view page of your job please? View your job, and then select the output so I can see how everything is configured. Karl On Thu, Aug 28, 2014 at 11:30 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: INFO - 2014-08-28 17:26:47.372

R: Populate field Solr

2014-08-29 Thread Bisonti Mario
: Karl Wright [mailto:daddy...@gmail.com] Sent: Friday, 29 August 2014 13:46 To: Karl Wright; Bisonti Mario; user@manifoldcf.apache.org Subject: RE: Populate field Solr Hi Mario, I tried this here on 1.7 and it worked as expected. Please look at your Solr field mapping tab. There is a checkbox

Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-15 Thread Bisonti Mario
Hello. I tried to set up a MySQL repository connection but I get the error mentioned in the subject. I put mysql-connector-java-5.1.32-bin.jar in the /apache-manifoldcf-1.7/connector-lib-proprietary and /apache-manifoldcf-1.7/lib-proprietary folders, but I still get the error in the subject. What could I check?

R: Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-15 Thread Bisonti Mario
, you must build ManifoldCF yourself. Thanks, Karl On Mon, Sep 15, 2014 at 10:14 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: I am in /usr/share/manifoldcf/example/ and I execute: sudo java -jar start.jar Instead mysql-connector-java-5.1.32-bin.jar is in /usr

R: Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-16 Thread Bisonti Mario
the -src and -lib distribution, and then run ant make-deps build, and you should be able to use proprietary MySQL database connections. Thanks, Karl On Mon, Sep 15, 2014 at 10:24 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: I understood. In fact, I haven’t example

R: Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-16 Thread Bisonti Mario
AS $(IDCOLUMN), command_line AS $(URLCOLUMN), object_id AS $(DATACOLUMN) FROM icinga_commands WHERE command_id IN $(IDLIST) What could I check? Thanks a lot Mario From: Bisonti Mario Sent: Tuesday, 16 September 2014 09:58 To: 'user@manifoldcf.apache.org' Subject: R: Connection status:Threw

R: Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-16 Thread Bisonti Mario
at 4:53 AM, Bisonti Mario mario.biso...@vimar.commailto:mario.biso...@vimar.com wrote: When I start a test job to extract a table I obtain: Error: Bad seed query; doesn't return $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. $(IDCOLUMN). My configuration: Seeding query

R: Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-16 Thread Bisonti Mario
AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Yes, but I obtained the same error. SELECT command_id AS “$(IDCOLUMN)” FROM icinga_commands I tried the query SELECT command_id AS $(IDCOLUMN) FROM icinga_commands with a MySQL client and it works. From: Karl Wright

R: Connection status:Threw exception: 'Driver class not found: com.mysql.jdbc.Driver'

2014-09-16 Thread Bisonti Mario
it as a comment to the ticket. Thanks, Karl On Tue, Sep 16, 2014 at 7:59 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Yes, it works, and “ “ aren’t necessary. Note this: from the MySQL client, SELECT command_id AS $(IDCOLUMN) FROM icinga_commands does not work, whereas SELECT

R: Web crawling , robots.txt and access credentials

2014-09-17 Thread Bisonti Mario
are fetched or not fetched, and why. Karl On Tue, Sep 16, 2014 at 11:04 AM, Bisonti Mario mario.biso...@vimar.commailto:mario.biso...@vimar.com wrote: Hallo. I would like to crawl some documents in a subfolder of a web site: http://aaa.bb.com/ Structure is: http://aaa.bb.com/ccc/folder1 http

List of file to index or remove to Solr

2014-09-18 Thread Bisonti Mario
Hello. Scenario: I would like to index a list of files, for example: http://aaa.bb.com/ccc/folder1/doc1.pdf http://aaa.bb.com/ccc/folder1/doc2.pdf http://aaa.bb.com/ccc/folder1/doc3.pdf On another day, I might want to remove from the index, for example

R: List of file to index or remove to Solr

2014-09-18 Thread Bisonti Mario
On Thu, Sep 18, 2014 at 5:40 AM, Bisonti Mario mario.biso...@vimar.commailto:mario.biso...@vimar.com wrote: Hallo. Scenario: I would like to index a list of file for example: http://aaa.bb.com/ccc/folder1/doc1.pdf http://aaa.bb.com/ccc/folder1/doc2.pdf http://aaa.bb.com/ccc/folder1/doc3.pdf

Job in aborting status

2018-06-12 Thread Bisonti Mario
Hello. I have jobs in aborting status and they hang. I tried restarting ManifoldCF and I restarted the machine, but the jobs still hang in aborting status. Now I am not able to start any job because they stay in starting status. How could I solve it? Thanks.

seeds not working?

2018-06-12 Thread Bisonti Mario
Hello. I created a job to crawl a site and I want to crawl only a subfolder, so I used as seed: http://abc.mydomain.net/intranet/aaa/ But I see that it is also crawling: http://abc.mydomain.net/intranet/abc/ http://abc.mydomain.net/intranet/abd/ etc. Why is this? What am I doing wrong? Thanks a

FATAL 2018-06-18T18:29:23,676 (Worker thread '36') - Error tossed: null

2018-06-18 Thread Bisonti Mario
Hello. I configured ManifoldCF 2.10 with Tomcat 9.0.8 and Postgres 9.3. I configured multiprocess-file-example. When I create a job to scan a big Windows share (22000 docs: Word, PDF, etc.), ManifoldCF crashes with the message: at

Recall: Job in aborting status

2018-06-13 Thread Bisonti Mario
Bisonti Mario would like to recall the message “Job in aborting status”.

R: Job in aborting status

2018-06-13 Thread Bisonti Mario
ly you are not setting that up right either. If you want me to give a further analysis, please provide a thread dump of the manifoldcf process. Karl On Tue, Jun 12, 2018 at 10:38 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: For Job “A” it use as repository “Windows Share” connector an

R: Job in aborting status

2018-06-13 Thread Bisonti Mario
output is quite verbose so clearly you are not setting that up right either. If you want me to give a further analysis, please provide a thread dump of the manifoldcf process. Karl On Tue, Jun 12, 2018 at 10:38 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: For Job “A”

R: Job in aborting status

2018-06-12 Thread Bisonti Mario
with kill -9. To clean this up, you need to perform the lock-clean procedure: (1) Shut down all manifoldcf processes (2) Execute the lock-clean script (3) Start up the manifoldcf processes Thanks, Karl On Tue, Jun 12, 2018 at 7:11 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hal
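The three-step lock-clean procedure above can be sketched as a small shell function. This is only a sketch under assumptions: it supposes a multiprocess example directory that contains `stop-agents.sh` and `lock-clean.sh` (both script names appear elsewhere in these threads) plus a `start-agents.sh` script, which is assumed here. The function name `mcf_lock_clean` is hypothetical, and any other ManifoldCF process (for example the web application under Tomcat) would also have to be stopped before step 2.

```shell
# Sketch of the lock-clean procedure: stop, clean locks, start again.
# Assumption: the directory passed in holds stop-agents.sh, lock-clean.sh
# and start-agents.sh; adjust names and paths to your own deployment.
mcf_lock_clean() {
  mcf_dir="${1:?usage: mcf_lock_clean <example-dir>}"
  # (1) Shut down the ManifoldCF agents process.
  "$mcf_dir/stop-agents.sh" || return 1
  # (2) Execute the lock-clean script while nothing is running.
  "$mcf_dir/lock-clean.sh" || return 1
  # (3) Start the ManifoldCF agents process again
  #     (this script may stay in the foreground).
  "$mcf_dir/start-agents.sh"
}
```

A possible invocation, with a path taken from a later thread: `mcf_lock_clean /opt/manifoldcf/multiprocess-zk-example-proprietary`.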

R: Job in aborting status

2018-06-12 Thread Bisonti Mario
On Tue, Jun 12, 2018 at 10:03 AM Bisonti Mario <mario.biso...@vimar.com> wrote: I set up jobs: Job “A” to crawl “Windows Shares”, Job “B” to crawl my internal site. The problem was w

R: Job in aborting status

2018-06-12 Thread Bisonti Mario
environment, you have the following choices: (1) standalone HSQLDB (2) postgresql (3) mysql Thanks, Karl On Tue, Jun 12, 2018 at 9:06 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Thanks Karl. I tried to execute lock-clean from my example directory after I stop manifoldcf but I

connectors.xml modified: new repository not in the list

2018-06-15 Thread Bisonti Mario
Hello. I installed ManifoldCF on Tomcat with Postgres and I am configuring it for use with the folder /manifoldcf/multiprocess-file-example. Now I would like to add the repository "Windows Shares", so I uncommented in connectors.xml: And I added to connector-lib-proprietary jcifs-1.3.19.jar I

R: connectors.xml modified: new repository not in the list

2018-06-15 Thread Bisonti Mario
Mario, Is your jcifs named jcifs.jar or jcifs-1.3.19.jar? What is your ManifoldCF version? Maxence, From: Bisonti Mario [mailto:mario.biso...@vimar.com] Sent: Friday, 15 June 2018 11:39 To: user@manifoldcf.apache.org Subject: connectors.xml modifie

R: connectors.xml modified: new repository not in the list

2018-06-15 Thread Bisonti Mario
I solved it! I executed: sudo ./initialize.sh And the connectors have been refreshed! Thanks! From: Bisonti Mario Sent: Friday, 15 June 2018 11:46 To: user@manifoldcf.apache.org Subject: R: connectors.xml modified: new repository not in the list I leave the name jcifs-1.3.19.jar without rename

script to schedule MCF Jobs by crontab login unauthorized

2018-06-19 Thread Bisonti Mario
Hello, I used a script to remotely start a job from crontab on MCF 2.9.1 and it worked. The same script no longer works on MCF 2.10. Now I tried this command: curl -c "cookie" -XPOST 'http://localhost:8080/mcf-api-service/json/LOGIN' -d @/SCRIPTS/user.json where user.json is: { "user":"admin",

Webdav Repository

2018-06-21 Thread Bisonti Mario
Hallo. Is it possible to scan a remote webdav repository? I don’t find any info about it Thanks a lot Mario

R: FATAL 2018-06-18T18:29:23,676 (Worker thread '36') - Error tossed: null

2018-06-19 Thread Bisonti Mario
ng this. Try to find the filename it crashes on and copy that to a small crawl directory. Repeat crawl. On Mon, Jun 18, 2018 at 11:34 AM, Bisonti Mario <mario.biso...@vimar.com> wrote: Hello. I configured ManifoldCF 2.10 with Tomcat 9.0.8 and Postgres 9.3 I configured multiprocess-fil

R: FATAL 2018-06-18T18:29:23,676 (Worker thread '36') - Error tossed: null

2018-06-19 Thread Bisonti Mario
Karl On Tue, Jun 19, 2018 at 3:35 AM Bisonti Mario <mario.biso...@vimar.com> wrote: Hello. Note that I specified the mime types on my Solr output connection. Furthermore, I used the binary distribution; how could I patch it with your fix? I read on my job

Job stuck internal http error 500

2018-07-27 Thread Bisonti Mario
Hello. My job gets stuck indexing a 38MB xlsx file. What could I do to solve my problem? The error follows: 2018-07-27 08:55:15.562 WARN (qtp1521083627-52) [ x:core_share] o.e.j.s.HttpChannel /solr/core_share/update/extract java.lang.OutOfMemoryError at

Re: Solr connection, max connections and CPU

2018-07-27 Thread Bisonti Mario
you can use at the same time in Solr. Karl On Thu, Jul 26, 2018 at 9:22 AM Bisonti Mario wrote: Hello, I set up a Solr connection in the "Output connections" of Manifold I

R: Job stuck internal http error 500

2018-07-27 Thread Bisonti Mario
, 3:04 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. My job is stucking indexing an xlsx file of 38MB What could I do to solve my problem? In the following there is the error: 2018-07-27 08:55:15.562 WARN (qtp1521083627-52) [ x:core_share] o.e.j.s.HttpChannel /solr/core

R: Job stuck internal http error 500

2018-07-27 Thread Bisonti Mario
On Fri, Jul 27, 2018, 3:04 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. My job is stucking indexing an xlsx file of 38MB What could I do to solve my problem? In the following there is the error: 2018-07-27 08:55:15.562 WARN (qtp1521083627-52) [ x:core_share] o.e.j.s

Solr connection, max connections and CPU

2018-07-26 Thread Bisonti Mario
Hello, I set up a Solr connection in the "Output connections" of Manifold. I don't understand whether there is a relation between "Max Connections" and the number of CPUs in the host. Could you help me to understand it? Thanks a lot Mario

R: Different time in Simple History Report

2018-08-10 Thread Bisonti Mario
:47 A: user@manifoldcf.apache.org Oggetto: Re: Different time in Simple History Report Did you first do: ant make-core-deps ? Karl On Fri, Aug 10, 2018 at 5:04 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Thanks Karl. I tried to compile the trunk version but I obtian: Bui

Different time in Simple History Report

2018-08-09 Thread Bisonti Mario
Hello. I see a difference in the start time in “Simple History Report”. It seems to be 2 hours late. Do I have to set the timezone for this report? Thanks a lot. See the attachment.

R: Different time in Simple History Report

2018-08-14 Thread Bisonti Mario
2018 17:23 A: user@manifoldcf.apache.org Oggetto: Re: Different time in Simple History Report Try it now. Karl On Fri, Aug 10, 2018 at 10:57 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Yes sudo ant make-core-deps Buildfile: /home/administrator/mcfsorce/trunk/build.xml

R: Different time in Simple History Report

2018-08-14 Thread Bisonti Mario
in the browser timezone? Or all times displayed in the server timezone? Karl On Tue, Aug 14, 2018 at 5:13 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. I compiled, but with this version I see the time 2 hour less of the right time and the report seems wrong time by the

R: Different time in Simple History Report

2018-08-14 Thread Bisonti Mario
at all. The only change was in how the report data was presented. Can you please check your browser time? Karl On Tue, Aug 14, 2018 at 6:13 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hi Karl. In my environment, browser time and server timezone are the same.

R: Different time in Simple History Report

2018-08-10 Thread Bisonti Mario
Locale.ROOT); c.setTimeInMillis(time); // We want to format this string in a compact way: // mm-dd- hh:mm:ss.mmm <<<<<< As you see, formerly the timezone was local time. The change required an explicit timezone in order to pass the forbidden APIs test, and

R: Different time in Simple History Report

2018-08-10 Thread Bisonti Mario
is in *server* timezone. That accounts for the difference. Karl On Thu, Aug 9, 2018 at 10:23 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo I see a difference from the start time in “Simple History Report” It seems late of 2 hours. Have I to set timezone for this report? Thanks

R: Job stuck internal http error 500

2018-08-08 Thread Bisonti Mario
@manifoldcf.apache.org Oggetto: Re: Job stuck internal http error 500 I am afraid you will need to open a Tika ticket, and be prepared to attach your file to it. Thanks, Karl On Fri, Jul 27, 2018 at 6:04 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: It isn’t a memory problem becau

R: Different time in Simple History Report

2018-08-14 Thread Bisonti Mario
that ensures that all times displayed in reports are in the browser client timezone. The same timezone is used throughout. Hopefully this will clear up any remaining confusion. Karl On Tue, Aug 14, 2018 at 6:33 AM Bisonti Mario <mario.biso...@vimar.com> wrote: This from

R: Different time in Simple History Report

2018-08-14 Thread Bisonti Mario
@manifoldcf.apache.org Oggetto: Re: Different time in Simple History Report It does not look at all like you have properly built with the changed source code. Karl On Tue, Aug 14, 2018 at 9:51 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: I am not able to check th

R: Job stuck internal http error 500

2018-08-08 Thread Bisonti Mario
@manifoldcf.apache.org Subject: Re: Job stuck internal http error 500 Thanks for the update! Did the Tika people say when 1.19 will be released? Karl On Wed, Aug 8, 2018 at 8:29 AM Bisonti Mario <mario.biso...@vimar.com> wrote: Hello. You were right, Karl. I have been helped by the tika

R: How to set Tika with ManifoldCF and Solr

2018-10-12 Thread Bisonti Mario
cannot reproduce your problem. Perhaps you can download a new instance and configure it from scratch using the embedded tika? If that works it should be possible to figure out what the difference is. Karl On Thu, Oct 11, 2018, 12:23 PM Bisonti Mario mailto:mario.biso...@vimar.com>> wr

Add field to Output Solr

2018-10-16 Thread Bisonti Mario
Hello. I am using Tika server to process PDF, DOC, etc. files. I configured: [attached image] In my Solr output connection, so, when I index the documents I see the fields: id last_modified resourcename content_type allow_token_document deny_token_document allow_token_share

R: Add field to Output Solr

2018-10-16 Thread Bisonti Mario
, Oct 16, 2018 at 4:38 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo I am using Tika server as processor of file pdf, doc, etc I configured: [cid:image003.png@01D4653C.61DD4040] In my solr output connection, so, when I index the documents I see the field: id last_mo

R: Job stuck without message

2018-10-30 Thread Bisonti Mario
or a while before it gives up on it. It appears to be stuck but it is not. You can verify that by looking at the Document Queue report to see what is queued and what times the various documents will be retried. Karl On Tue, Oct 30, 2018 at 5:07 AM Bisonti Mario mailto:mario.biso...@vimar.c

R: Job stuck without message

2018-10-30 Thread Bisonti Mario
is unhappy. If the failure is something that indicates that the document is never going to be readable, that's a different problem and we might need to address that in the connector. Karl On Tue, Oct 30, 2018 at 10:33 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Thanks a lot Kar

Job stuck without message

2018-10-30 Thread Bisonti Mario
Hallo. I started a job that works for some minutes, and after it stucks. In the manifoldcf.log I see: at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:627) [mcf-jcifs-connector.jar:?] at

Error Job stop after repeatidly interruption

2018-11-08 Thread Bisonti Mario
Hello. I am trying to index more than 500 documents in a Windows Share. The job gets interrupted due to repeated interruptions. This is the manifold.log: . . WARN 2018-11-07T21:53:25,296 (Worker thread '59') - Service interruption reported for job 1533797717712 connection

R: Job stuck without message

2018-11-06 Thread Bisonti Mario
) get put into a "ready for processing" state which don't have any document priority set. But this should have been addressed, certainly, by the most recent release and probably by 2.10 as well. Karl On Tue, Nov 6, 2018 at 5:43 AM Bisonti Mario mailto:mario.biso...@vimar.com>> w

R: Job stuck without message

2018-11-06 Thread Bisonti Mario
it if you can look at the simple history for one of these documents; I need to see what happened to it last. Thanks, Karl On Tue, Nov 6, 2018 at 7:32 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: My version is 2.11 Da: Karl Wright mailto:daddy...@gmail.com>> Inviato: marted

How to set Tika with ManifoldCF and Solr

2018-10-11 Thread Bisonti Mario
Hello. I would like to use a Tika server started from the command line with ManifoldCF, so that ManifoldCF, via a Transformation connector, processes documents with Tika and indexes them to the Solr output connector. I started Tika server: java -jar /opt/tika/tika-server-1.19.1.jar After, I created a transformation connection

R: How to set Tika with ManifoldCF and Solr

2018-10-11 Thread Bisonti Mario
cting update handler", and you want to change the output handler from "/update/extract" to just "/update". Karl On Thu, Oct 11, 2018 at 4:45 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. I would like to use Tika server started from command

R: How to set Tika with ManifoldCF and Solr

2018-10-11 Thread Bisonti Mario
ts aren't getting indexed. Thanks, Karl On Thu, Oct 11, 2018 at 7:10 AM Bisonti Mario <mario.biso...@vimar.com> wrote: Thanks Karl. I tried, but it doesn’t index documents. It seems that it doesn’t see them? Perhaps is the “Ignore Tika exception that I don’t know where to se

R: How to set Tika with ManifoldCF and Solr

2018-10-11 Thread Bisonti Mario
My mistake… As you wrote, I had to uncheck “use extracting update handler”. Now I have to understand the field mentioned in schema etc. From: Bisonti Mario Sent: Thursday, 11 October 2018 13:45 To: user@manifoldcf.apache.org Subject: R: How to set Tika with ManifoldCF and Solr I see the job

R: How to set Tika with ManifoldCF and Solr

2018-10-11 Thread Bisonti Mario
when I used Tika inside solr. Could you help me? Thanks Da: Bisonti Mario Inviato: giovedì 11 ottobre 2018 14:03 A: user@manifoldcf.apache.org Oggetto: R: How to set Tika with ManifoldCF and Solr My mistake… As you wrote me I had to uncheck “use extracting update handler” Now I have to understand

R: How to set Tika with ManifoldCF and Solr

2018-10-11 Thread Bisonti Mario
herwise, I wonder if there's a problem with the external Tika extractor. Perhaps you can try the internal one to get your pipeline working first? If the external one does not send the right mime type, then we need to correct that so you should open a ticket. Thanks, Karl On Thu,

R: How to notify mail by SMTP

2019-01-15 Thread Bisonti Mario
jobstatuses.txt | wc -l) # If number of job in “Done” status are not equal to the number of job, I send an email if [ $numberdone -ne $numberjobs ] then echo "There are Manifold Job non completed" | mail -s "Job MCF non completed" mailrecei...@domain.net fi ex
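The check in the script excerpt above compares the number of jobs in “Done” status with the total number of jobs. A minimal sketch of that comparison follows. The layout of `jobstatuses.txt` is not shown in the thread, so this assumes a file with exactly one job status per line; the function name `all_jobs_done` is hypothetical.

```shell
# Hypothetical helper: succeed only if every job status in the file is "done".
# Assumption: the file holds one job status per non-empty line.
all_jobs_done() {
  status_file="${1:?usage: all_jobs_done <status-file>}"
  # Total number of jobs = number of non-empty lines.
  numberjobs=$(grep -c . "$status_file" || true)
  # Jobs whose whole line is "done" (case-insensitive).
  numberdone=$(grep -c -i -x 'done' "$status_file" || true)
  [ "$numberdone" -eq "$numberjobs" ]
}
```

It could then drive the notification in the same way as the excerpt, for example `all_jobs_done jobstatuses.txt || echo "There are ManifoldCF jobs not completed" | mail -s "MCF jobs not completed" recipient@example.net`, where the recipient address is a placeholder.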

R: External Tika Server

2018-12-04 Thread Bisonti Mario
In my Tika server, I added: -spawnChild -taskTimeoutMillis 100 to bypass the timeout problem. Mario From: Furkan KAMACI Sent: Tuesday, 4 December 2018 10:16 To: user@manifoldcf.apache.org; Rafa Haro Subject: Re: External Tika Server Hi Rafa, I can parse same document via HTTP URL of Tika

R: External Tika Server

2018-12-05 Thread Bisonti Mario
.parse(PDFParser.java:172) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) ... 42 more INFO tika (application/pdf) WARN No Unicode mapping for arrowhookright (45) in font LSUPIB+CMMI10 On Tue, Dec 4, 2018 at 3:36 PM Bisonti Mario mailto:mari

How to notify mail by SMTP

2018-12-06 Thread Bisonti Mario
Hello. I would like to be notified by mail at the end of a job. I use an SMTP server but I don't know how to configure this. I read https://lists.apache.org/list.html?user@manifoldcf.apache.org:dfr=2016-4-1|dto=2019-4-30:smtp but I understand that there is no way to configure it with SMTP now, isn’t it?

R: How to notify mail by SMTP

2018-12-06 Thread Bisonti Mario
Mario, there is an email notification connector. Have you tried to configure that? On Thu, Dec 6, 2018, 3:50 AM Bisonti Mario mailto:mario.biso...@vimar.com> wrote: Hallo. I would like to notify by mail the end of a job. I use an smtp server but I am not able how to configure this. I r

R: Job stuck without message

2018-11-28 Thread Bisonti Mario
Inviato: martedì 6 novembre 2018 15:27 A: user@manifoldcf.apache.org Oggetto: Re: Job stuck without message I added a couple of questions to the ticket. Please reply. Thanks, Karl On Tue, Nov 6, 2018 at 8:56 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Thanks a lot, K

R: Job stuck without message

2018-11-28 Thread Bisonti Mario
e and provide a row that corresponds to one of these documents? Thanks, Karl On Wed, Nov 28, 2018 at 10:26 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. Repository has Max connection=10 In the Document Status report” I see many item with : State=“Not yet processed” Status=”Ready for

R: Job slower

2019-01-28 Thread Bisonti Mario
I could consider executing “vacuumdb --all --analyze” daily if it could help. From: Karl Wright Sent: Friday, 25 January 2019 17:39 To: user@manifoldcf.apache.org Subject: Re: Job slower Did you try 'vacuum full'? Karl On Fri, Jan 25, 2019 at 3:47 AM Bisonti Mario mailto:mario.b

R: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-26 Thread Bisonti Mario
, it will not contain the jtds jar. Since you aren't seeing this error in the log for the java-agents process, I bet that's what you did. Karl On Tue, Feb 26, 2019 at 2:55 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hi Karl. The only message that I obtain from the UI web int

R: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Bisonti Mario
into the classpath, which should happen if you use the startup scripts. Please review the "how-to-build-and-deploy" page. Karl On Wed, Feb 20, 2019 at 9:09 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo, I would like to use MSSQL as repository. I use /opt/manifol

R: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Bisonti Mario
? and what process are you seeing the error from? You should *not* need to make any changes to the configuration if you put the jar file in place before building. Karl On Wed, Feb 20, 2019 at 9:47 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Thanks, Karl but I didn’t dow

Postgres db maintenance

2019-02-08 Thread Bisonti Mario
Hello. I noticed that my Postgres database dbname is 28GB. Is there a way to clean old data, or do I need to maintain all data in my db? Thanks a lot Mario

R: Job hang in aborting state for a long time

2019-02-11 Thread Bisonti Mario
Today I migrated from Postgres 9.3 to Postgres 11.1 and it is working well. I use MCF 2.12 Mario From: Karl Wright Sent: Monday, 11 February 2019 13:29 To: user@manifoldcf.apache.org Subject: Re: Job hang in aborting state for along time I know that 9.x works properly. I expect later

Job slower

2019-01-25 Thread Bisonti Mario
Hello. I use MCF 2.12, PostgreSQL 9.3.25, Solr 7.6, and Tika 1.19 on Ubuntu Server 18.04. I scheduled weekly maintenance via crontab for the user postgres: 15 8 * * Sun vacuumdb --all --analyze 20 10 * * Sun reindexdb postgres 25 10 * * Sun reindexdb dbname I see that the job that indexes 70 documents

Documentum connection not working

2019-07-16 Thread Bisonti Mario
Hallo. I am using MCF 2.12 I would like to create a Repository connection to a Documentum Docbase I obtain always the error: Connection temporarily failed: Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused (Connection refused) I don’t

R: Documentum connection not working

2019-07-16 Thread Bisonti Mario
19 at 6:12 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. I am using MCF 2.12 I would like to create a Repository connection to a Documentum Docbase I obtain always the error: Connection temporarily failed: Connection refused to host: 127.0.0.1; n

R: Manifold with OpenJDK

2019-10-17 Thread Bisonti Mario
Hello, I use Ubuntu 18.04.02 LTS with: openjdk version "11.0.4" 2019-07-16 And I have no issue with ManifoldCF Mario From: Markus Schuch Sent: Thursday, 17 October 2019 07:35 To: user@manifoldcf.apache.org; Praveen Bejji Subject: Re: Manifold with OpenJDK Hi Praveen, we use openjdk 8 in

R: Memory problem on Agent ?

2020-10-02 Thread Bisonti Mario
Java more memory than your machine has. This will not work. Karl On Fri, Oct 2, 2020 at 10:45 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. When I scan the content of Repository , I note that memory used is very high and it isn’t released i.e. 60GB on 70GB availa

Memory problem on Agent ?

2020-10-02 Thread Bisonti Mario
Hello. When I scan the content of the repository, I note that memory usage is very high and it isn't released, i.e. 60GB of 70GB available. I tried to free it by shutting down the agent but I am not able to: /opt/manifoldcf/multiprocess-zk-example-proprietary/stop-agents.sh OpenJDK 64-Bit Server VM

R: Memory problem on Agent ?

2020-10-05 Thread Bisonti Mario
the remainder among your Java processes. Thanks, Karl On Fri, Oct 2, 2020 at 11:21 AM Bisonti Mario <mario.biso...@vimar.com> wrote: Yes, but it seems that, when the indexing finished, the memory is not released Da: Karl Wright <daddy...@gmail.com> Inviato: venerdì 2 ottobr

R: Job interrupted

2020-08-26 Thread Bisonti Mario
il.com>> wrote: Ok, then let me examine the code and see why it's not catching it. Karl On Mon, Aug 24, 2020 at 8:49 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Yes, I see only that exception inside the manifoldcf.log and the job stops with: Error: Repeated service

How to reset job status

2020-08-19 Thread Bisonti Mario
Hello. I have a job in the status “End notification” that hangs in this state. Is there a way to reset it? I tried the script lock-clean.sh without effect. In this state I am not able to manage jobs. What

R: How to reset job status

2020-08-19 Thread Bisonti Mario
w is failing. If it is failing you would see log output. Do you see log output? Karl On Wed, Aug 19, 2020 at 5:40 AM Bisonti Mario <mario.biso...@vimar.com> wrote: No, I don’t have a notification connector, but that isn’t the problem. Manifoldcf.log is empty. The problem is that job is on hanging

Job interrupted

2020-08-24 Thread Bisonti Mario
Hallo. I have some problems about job interrupted. The job execute a windows share scan After many errors, sometimes it stops I see in the manifoldcf.log many errors: at

R: Job interrupted

2020-08-24 Thread Bisonti Mario
interrupted Hi, That's a warning. The job will keep running and the document will be retried later. Karl On Mon, Aug 24, 2020 at 5:24 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hallo. I have some problems about job interrupted. The job execute a windows share scan Afte

R: Job interrupted

2020-08-24 Thread Bisonti Mario
for that, I believe. Karl On Mon, Aug 24, 2020 at 5:55 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Yes, but after I obtain: Error: Repeated service interruptions - failure processing document: The process cannot access the file because it is being used by another p

R: How to reset job status

2020-08-19 Thread Bisonti Mario
to confirm. Karl On Wed, Aug 19, 2020 at 4:56 AM Bisonti Mario <mario.biso...@vimar.com> wrote: Hello. I have a job in the status “End notification” that hangs in this state. Is there a way to reset it? I tried the script lock-clean.sh without effect. In this state I am no

Error: Repeated service interruptions - failure processing document: Read timed out

2021-09-28 Thread Bisonti Mario
Hello I have error on a Job that parses a network folder. This is the tika error: 2021-09-28 16:14:50 INFO Server:415 - Started @1367ms 2021-09-28 16:14:50 WARN ContextHandler:1671 - Empty contextPath 2021-09-28 16:14:50 INFO ContextHandler:916 - Started

R: Error: Repeated service interruptions - failure processing document: Read timed out

2021-09-30 Thread Bisonti Mario
Additional info: I am using the 2.17-dev version. From: Bisonti Mario Sent: Tuesday, 28 September 2021 17:01 To: user@manifoldcf.apache.org Subject: Error: Repeated service interruptions - failure processing document: Read timed out Hello I have error on a Job that parses a network folder

Documentation issue?

2023-09-14 Thread Bisonti Mario
Hi, I would like to report that at the url: https://manifoldcf.apache.org/release/release-2.25/en_US/index.html I obtain: Not Found The requested URL was not found on this server. Thank you Mario

web crawler https

2023-09-25 Thread Bisonti Mario
Hi, I would like to try indexing a Wordpress internal site. I tried to configure Repository Web, Job with seeds but I always obtain: WARN 2023-09-25T16:31:50,905 (Worker thread '4') - Service interruption reported for job 1695649924581 connection 'Wp': IO exception

R: web crawler https

2023-09-26 Thread Bisonti Mario
generally debug what certs a site might need by trying to fetch a page with curl and using verbose debug mode. Karl On Mon, Sep 25, 2023 at 10:48 AM Bisonti Mario mailto:mario.biso...@vimar.com>> wrote: Hi, I would like to try indexing a Wordpress internal site. I tried to configure Rep
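The suggestion above, fetching a page with curl in verbose mode to see which certificates a site presents, might look like the following sketch. The function name `show_tls_debug` and the filtering with grep are illustrative assumptions; only `curl -v` itself comes from the advice in the thread.

```shell
# Sketch: print curl's informational lines (TLS handshake, certificate
# subject/issuer, verification errors) for a URL.
# The function name and the grep filter are illustrative assumptions.
show_tls_debug() {
  url="${1:?usage: show_tls_debug <https-url>}"
  # -v writes connection and certificate details to stderr, prefixed with "*";
  # -s hides the progress meter and -o /dev/null discards the page body.
  curl -sv -o /dev/null "$url" 2>&1 | grep '^\*'
}
```

For an internal site one would call it with that site's own address, for example `show_tls_debug https://intranet.example.net/` (a placeholder URL), and look at the reported issuer and any verification error.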

JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy

2023-01-18 Thread Bisonti Mario
Hi. Often, I obtain the error: WARN 2023-01-18T06:18:19,316 (Worker thread '89') - JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy. jcifs.smb.SmbException: All pipe instances are busy. at
