[
https://issues.apache.org/jira/browse/CONNECTORS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1522:
Fix Version/s: ManifoldCF 2.12
> Add SSL trust certificates list to ElasticSea
[
https://issues.apache.org/jira/browse/CONNECTORS-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned CONNECTORS-1522:
---
Assignee: Karl Wright
> Add SSL trust certificates list to ElasticSea
[
https://issues.apache.org/jira/browse/TIKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574706#comment-16574706
]
Karl Wright commented on TIKA-2693:
---
Re: testing: I don't have a test setup here, and the user
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574676#comment-16574676
]
Karl Wright commented on CONNECTORS-1490:
-
[~piergiorgioluc...@gmail.com], it ran correctly
There is no autovacuum for MySQL. MySQL apparently does dead tuple cleanup
as it goes.
Karl
On Thu, Aug 9, 2018 at 6:13 AM Gustavo Beneitez
wrote:
> Hi,
>
> looking at the manifoldCF pom I can see
>
> 1.0.4-SNAPSHOT
>
> I'm not aware of any change in database, in fact ours is MySQL, I don't
>
[
https://issues.apache.org/jira/browse/LUCENE-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned LUCENE-8451:
---
Assignee: Karl Wright
> GeoPolygon test fail
[
https://issues.apache.org/jira/browse/LUCENE-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574582#comment-16574582
]
Karl Wright commented on LUCENE-8451:
-
[~ivera], I won't have any possibility of looking
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574570#comment-16574570
]
Karl Wright commented on CONNECTORS-1490:
-
Hi [~piergiorgioluc...@gmail.com], we have
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574552#comment-16574552
]
Karl Wright commented on CONNECTORS-1490:
-
Ok, thanks. I'm going to try running the IT from
[
https://issues.apache.org/jira/browse/TIKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573935#comment-16573935
]
Karl Wright commented on TIKA-2693:
---
I am currently with my wife in the emergency room, so trying things
h during this week but I don't know if we have time to directly bring
> it in this release.
>
> In September I hope to bring the new website and Alfresco BFSI and then the
> Azure Storage connectors.
>
> Cheers,
> PJ
>
> On Tue, 7 Aug 2018 at 14:05, Karl Wright
[
https://issues.apache.org/jira/browse/TIKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573198#comment-16573198
]
Karl Wright commented on TIKA-2693:
---
I am being clobbered with Tika/POI issues at the moment so I'm
allation and the problem was solved!
>
>
>
> Now I solved it using the Tika 1.19 nightly build.
>
>
>
>
>
> Thanks a lot.
>
>
>
>
>
>
>
> *From:* Karl Wright
> *Sent:* Friday, 27 July 2018 12:39
> *To:* user@manifoldcf.apache.org
> *Og
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573002#comment-16573002
]
Karl Wright commented on CONNECTORS-1521:
-
There is one hacky approach that would certainly
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572995#comment-16572995
]
Karl Wright commented on CONNECTORS-1521:
-
I'm afraid I don't have time to even contemplate
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572690#comment-16572690
]
Karl Wright commented on CONNECTORS-1490:
-
The only remaining issue is how the tests are run
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572688#comment-16572688
]
Karl Wright commented on CONNECTORS-1490:
-
ok, moved.
> GSOC: MongoDB Output Connec
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571798#comment-16571798
]
Karl Wright commented on CONNECTORS-1490:
-
Also, build.xml has the following:
{code
[
https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571787#comment-16571787
]
Karl Wright commented on CONNECTORS-1490:
-
What was the final decision about what version
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571629#comment-16571629
]
Karl Wright commented on CONNECTORS-1521:
-
{quote}
As far as I can see none of the patterns
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571624#comment-16571624
]
Karl Wright commented on CONNECTORS-1521:
-
[~jamesthomas], computing a date relative to "
When will it be ready for integration?
Karl
On Tue, Aug 7, 2018 at 7:10 AM Irindu Nugawela wrote:
> Hi Karl,
>
> I am currently preparing the patch for mcf-mongodb-output-connector. I
> would be glad if we can include it in the next release.
>
> On Mon, 6 Aug 2018 at 16:00,
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571462#comment-16571462
]
Karl Wright edited comment on CONNECTORS-1521 at 8/7/18 11:05 AM
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571462#comment-16571462
]
Karl Wright commented on CONNECTORS-1521:
-
All I have access to indicates that IDfTime
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571444#comment-16571444
]
Karl Wright commented on CONNECTORS-1521:
-
The method that is used to build the date string
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned CONNECTORS-1521:
---
Assignee: Karl Wright
> Documentum Connector users ManifoldCF's local t
[
https://issues.apache.org/jira/browse/CONNECTORS-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571312#comment-16571312
]
Karl Wright commented on CONNECTORS-1492:
-
[~piergiorgioluc...@gmail.com], I suspect that we
the first place, or
> it is just to be expected with the nature of the multiple worker threads
> and the query types issued by ManifoldCF?
>
> Best Regards,
>
>
>
> Guy
>
>
>
> *From:* Karl Wright [mailto:daddy...@gmail.com]
> *Sent:* 06 August 2018 12:16
> *To:* user@
PDATE
>
> 2018-08-03 15:52:42.855 BST [5716] ERROR: could not serialize access due
> to concurrent update
>
> “
>
>
>
> These errors don’t suggest a retry may sort them out - is this an issue?
>
>
>
> Many Thanks,
>
>
>
> Guy
>
>
>
> *Fr
I'm hoping to cut RC0 of 2.11 around August 15th. Any objection?
Karl
[
https://issues.apache.org/jira/browse/LUCENE-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned LUCENE-8444:
---
Assignee: Ignacio Vera (was: Karl Wright)
> Geo3D Test Failure: Test Point is Contai
[
https://issues.apache.org/jira/browse/LUCENE-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16570011#comment-16570011
]
Karl Wright commented on LUCENE-8444:
-
[~ivera] That sounds like the proper fix then. It's exactly
[
https://issues.apache.org/jira/browse/LUCENE-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569976#comment-16569976
]
Karl Wright commented on LUCENE-8444:
-
[~ivera], identical cutoff planes are bad news.
If we detect
[
https://issues.apache.org/jira/browse/LUCENE-8444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned LUCENE-8444:
---
Assignee: Karl Wright
> Geo3D Test Failure: Test Point is Contained by shape but outs
2018-08-03 15:52:25.218 BST [4140] HINT: The transaction might succeed if
retried.
<<<<<<
... occur because of concurrent transactions. The transaction is indeed
retried when this occurs, so unless your job aborts, you are fine.
Karl
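To illustrate the retry behaviour described in the reply above, here is a minimal sketch of retrying a unit of work when the database reports a serialization failure. It is not ManifoldCF's actual code: the class `SerializationRetry`, the interface `TxBody`, and the `maxAttempts` parameter are hypothetical names introduced for illustration. The only external fact relied on is that PostgreSQL reports "could not serialize access due to concurrent update" with SQLSTATE 40001.

```java
import java.sql.SQLException;

// Hypothetical helper, for illustration only; not part of ManifoldCF.
final class SerializationRetry {
    // SQLSTATE that PostgreSQL uses for serialization failures.
    static final String SERIALIZATION_FAILURE = "40001";

    // A unit of transactional work that may fail with an SQLException.
    interface TxBody<T> {
        T run() throws SQLException;
    }

    private SerializationRetry() {}

    // Runs the body, retrying only when the failure is a serialization failure.
    // Any other SQLException is rethrown immediately.
    static <T> T runWithRetry(TxBody<T> body, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return body.run();
            } catch (SQLException e) {
                if (!SERIALIZATION_FAILURE.equals(e.getSQLState())) {
                    throw e;
                }
                last = e; // concurrent update: the same work may succeed if retried
            }
        }
        throw last; // every attempt hit a serialization failure
    }
}
```

In a sketch like this the ERROR lines still appear in the database log for every failed attempt, which matches the observation that the log entries alone do not indicate a problem unless the job aborts.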
On Mon, Aug 6, 2018 at 5:49 A
6
> and 10.
>
> Simple to resolve though.
>
> Steph
>
>
>
>
>
>
> On Fri, Aug 3, 2018 at 1:29 PM, Karl Wright wrote:
>
> Hi Guy,
>
>
>
> I use Postgresql 9.6 myself and have found no issues with it. I don't
> know about v 10 however.
>
>
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569769#comment-16569769
]
Karl Wright commented on CONNECTORS-1517:
-
Attached a second patch, to be applied
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1517:
Attachment: CONNECTORS-1517-2.patch
> Documentum Connector uses differ
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright resolved CONNECTORS-1517.
-
Resolution: Fixed
tentative fix committed: r1837476
> Documentum Connector u
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569756#comment-16569756
]
Karl Wright commented on CONNECTORS-1517:
-
[~jamesthomas], I've coded a tentative patch
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1517:
Attachment: CONNECTORS-1517.patch
> Documentum Connector uses differ
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569724#comment-16569724
]
Karl Wright commented on CONNECTORS-1517:
-
That's unfortunate, because I don't know DQL
[
https://issues.apache.org/jira/browse/LUCENE-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569594#comment-16569594
]
Karl Wright commented on LUCENE-8445:
-
It worries me that the detection of identical planes needs
[
https://issues.apache.org/jira/browse/LUCENE-8445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned LUCENE-8445:
---
Assignee: Ignacio Vera
> RandomGeoPolygonTest.testCompareBigPolygons() fail
> How can I debug this? Any idea? Does Jetty have a log file?
>
>
>
> Regards,
>
>
>
>
>
>
>
>
>
>
> *From:* Karl Wright [mailto:daddy...@gmail.com]
> *Sent:* Tuesday, 31 July 2018 15:32
> *To:* user@manifoldcf.
There must be a reason.
Karl
On Tue, Jul 31, 2018 at 8:18 AM msaunier wrote:
> Hello Karl,
>
>
>
> Today and yesterday I had an error with Jetty. Jetty crashes for no reason.
>
>
>
> Error:
>
> ./start.sh : ligne 41 : 562 Processus arrêté "$JAVA_HOME/bin/java"
> $OPTIONS
Hi Vinay,
Dynamic rescan is meant for web-crawling and revisits already crawled
documents based on how often they have changed in the past. It is
therefore wholly inappropriate for something like a file crawl, since
directory contents (one of the kinds of documents there are in a file
crawl)
[
https://issues.apache.org/jira/browse/TIKA-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562606#comment-16562606
]
Karl Wright commented on TIKA-2693:
---
[~kiwiwings], when is Tika planning to go to POI 4.0.0?
> T
[
https://issues.apache.org/jira/browse/CONNECTORS-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1520:
Attachment: CONNECTORS-1520-2.patch
> Connector registration/deregistration fa
Ok, attached a second fix.
Karl
On Mon, Jul 30, 2018 at 4:09 PM Karl Wright wrote:
> Yes, of course. I overlooked that. Will fix.
>
> Karl
>
>
> On Mon, Jul 30, 2018 at 3:54 PM Mike Hugo wrote:
>
>> That limit only applies to the list of transformations, not the
new MultiClause(jobs.idField,jobIDs)}))
> .append(" FOR UPDATE");
> <<<<<<
>
> Which generates a query with a large OR clause
>
>
> Mike
>
> On Mon, Jul 30, 2018 at 2:44 PM, Karl Wright wrote:
>
>> The limit is app
et set =
> database.performQuery(query.toString(),newList,null,null);
> int i = 0;
> while (i < set.getRowCount())
> {
> IResultRow row = set.getRow(i++);
> Long jobID = (Long)row.getValue(jobs.idField);
> int statusValue =
> jobs.stringToS
[
https://issues.apache.org/jira/browse/CONNECTORS-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright resolved CONNECTORS-1520.
-
Resolution: Fixed
r1837084
> Connector registration/deregistration fails when m
[
https://issues.apache.org/jira/browse/CONNECTORS-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1520:
Attachment: CONNECTORS-1520.patch
> Connector registration/deregistration fails w
Karl Wright created CONNECTORS-1520:
---
Summary: Connector registration/deregistration fails when more
than a certain number of jobs
Key: CONNECTORS-1520
URL: https://issues.apache.org/jira/browse/CONNECTORS-1520
es "OR"
return getMaxOrClause();
}
<<<<<<
The problem is that there was a cut-and-paste error, with just
transformation connections, that defeated the limit. I'll create a ticket
and attach a patch. CONNECTORS-1520.
Karl
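The limit discussed above exists so that no single query carries an unbounded OR clause. As a rough sketch of that general idea (not the contents of the CONNECTORS-1520 patch), a long list of job IDs can be split into batches no larger than the limit, with one query issued per batch. The class `OrClauseBatcher` and the parameter name `maxOrClause` are hypothetical; the real limit value and the way ManifoldCF applies it are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of ManifoldCF.
final class OrClauseBatcher {
    private OrClauseBatcher() {}

    // Splits ids into consecutive batches of at most maxOrClause elements,
    // preserving the original order.
    static <T> List<List<T>> partition(List<T> ids, int maxOrClause) {
        if (maxOrClause < 1) {
            throw new IllegalArgumentException("maxOrClause must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += maxOrClause) {
            int end = Math.min(start + maxOrClause, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }
}
```

Each batch would then feed one query, so the size of any generated OR clause stays bounded no matter how many jobs exist.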
On Mon, Jul 30, 2018 at 2:29 PM Karl Wright
loses the
> connection before returning with a response.
>
> As I mentioned, this instance of Manifold has nearly 40,000 web crawlers.
> Is that a high number for Manifold to handle?
>
> On Mon, Jul 30, 2018 at 10:58 AM, Karl Wright wrote:
>
>> Well, I have absolutely no i
or API resource to
> extract documents from Domino server?
>
> Best wishes,
> Cheng
>
> On 30 Jul 2018, at 17:48, Karl Wright wrote:
>
> Hi Cheng,
>
> Dynamic recrawl revisits documents based on the frequency that they
> changed in the past. It is therefore hard t
res run on the same host.
>
> On Mon, Jul 30, 2018 at 9:35 AM, Karl Wright wrote:
>
>> ' LOG: incomplete message from client'
>>
>> This shows a network issue. Did your network configuration change
>> recently?
>>
>> Karl
>>
>>
>>
.createStatement(PgConnection.java:1576)
> at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:367)
> at org.apache.manifoldcf.core.database.Database.execute(Database.java:873)
> at
> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696)
>
Hi Cheng,
Dynamic recrawl revisits documents based on the frequency that they changed
in the past. It is therefore hard to make any prediction about whether a
document will be recrawled in a given time interval. You need recrawls of
existing directories in order to discover new documents in
It looks to me like your database server is not happy. Maybe it's out of
resources? Not sure but a restart may be in order.
Karl
On Sun, Jul 29, 2018 at 9:06 AM Mike Hugo wrote:
> Recently we started seeing this error when Manifold CF starts up. We had
> been running Manifold CF with many
[
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560364#comment-16560364
]
Karl Wright commented on CONNECTORS-1519:
-
Can you have a look at what has changed
Can you view the job and include a screen shot of where this is displayed?
Thanks.
The exclusions are not regexps -- they are file specs. The file specs have
special meanings for "*" (matches everything) and "?" (matches one
character). You do not need to URL encode them.
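To make the file-spec semantics above concrete, here is a minimal sketch of a matcher in which "*" matches any run of characters (including none) and "?" matches exactly one character. The class `FileSpecMatcher` is a hypothetical name, and this is not the connector's own implementation; in particular, how the connector treats path separators or letter case is not covered here.

```java
// Hypothetical helper, for illustration only; not part of ManifoldCF.
final class FileSpecMatcher {
    private FileSpecMatcher() {}

    // Returns true if text matches spec, where '*' matches any run of
    // characters (possibly empty) and '?' matches exactly one character.
    static boolean matches(String spec, String text) {
        int s = 0;          // position in spec
        int t = 0;          // position in text
        int starSpec = -1;  // position of the most recent '*' in spec
        int starText = -1;  // text position where that '*' started matching
        while (t < text.length()) {
            if (s < spec.length() && spec.charAt(s) == '*') {
                starSpec = s++;   // remember the star, try matching zero characters
                starText = t;
            } else if (s < spec.length()
                    && (spec.charAt(s) == '?' || spec.charAt(s) == text.charAt(t))) {
                s++;
                t++;
            } else if (starSpec >= 0) {
                s = starSpec + 1; // backtrack: let the last star absorb one more character
                t = ++starText;
            } else {
                return false;
            }
        }
        while (s < spec.length() && spec.charAt(s) == '*') {
            s++;                  // trailing stars match the empty remainder
        }
        return s == spec.length();
    }
}
```

For example, `FileSpecMatcher.matches("file?.txt", "file1.txt")` is true while `FileSpecMatcher.matches("file?.txt", "file10.txt")` is false, because "?" stands for exactly one character. No regular-expression escaping or URL encoding is involved.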
If you enable
ad I could find it more easily, but I'm
> afraid the crawl is very long.
>
> Maybe you have an idea of the best method to adopt to find this / these
> documents?
>
>
>
> Maxence
>
>
>
> *From:* Karl Wright [mailto:daddy...@gmail.com]
> *Sent:* Friday, 2
Hi all,
I've easily spent 40 hours over the last two weeks chasing down bugs in
Apache Tika and POI. The two kinds I see are "ClassNotFound" (due to usage
of the wrong ClassLoader), and "OutOfMemoryError" (not clear what it is due
to yet).
I don't have enough time to create tickets directly in
set:
>
> sudo nano options.env.unix
>
> -Xms2048m
>
> -Xmx2048m
>
>
>
> But I get the same error.
>
> My suspicion is that it could be a Solr/Tika problem.
>
> What could I do?
>
> I restricted the scan to a single file and I get the same error.
>
>
>
It is not clear, though, which process you are talking about. If it is Solr, ask
them.
Karl
On Fri, Jul 27, 2018, 5:36 AM Karl Wright wrote:
> I am presuming you are using the examples. If so, edit the options file
> to grant more memory to your agents process by increasing the Xmx value.
>
I am presuming you are using the examples. If so, edit the options file to
grant more memory to your agents process by increasing the Xmx value.
Karl
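After changing the Xmx value in the options file, it can help to confirm what heap ceiling a JVM actually received. The following is a small stand-alone sketch using only the standard `Runtime` API; the class name `HeapReport` is hypothetical, and it reports the limit of whichever JVM runs it, so it would need to be launched with the same options as the agents process to be meaningful.

```java
// Hypothetical stand-alone check, for illustration only; not part of ManifoldCF.
final class HeapReport {
    private HeapReport() {}

    // Maximum heap, in megabytes, that this JVM will attempt to use.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported figure may be somewhat lower than the configured -Xmx value, depending on the JVM and garbage collector in use.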
On Fri, Jul 27, 2018, 3:04 AM Bisonti Mario wrote:
> Hello.
>
> My job gets stuck indexing an xlsx file of 38MB
>
>
>
> What could I do to
[
https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559269#comment-16559269
]
Karl Wright commented on CONNECTORS-1518:
-
[~svanschalkwyk], we don't control how much
[
https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright resolved CONNECTORS-1518.
-
Resolution: Fixed
r1836769
> MCF shutting down when Tika is u
[
https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1518:
Attachment: CONNECTORS-1518.patch
> MCF shutting down when Tika is u
[
https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559082#comment-16559082
]
Karl Wright commented on CONNECTORS-1518:
-
Hi [~svanschalkwyk], the memory usage
[
https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1518:
Fix Version/s: ManifoldCF 2.11
> MCF shutting down when Tika is u
[
https://issues.apache.org/jira/browse/CONNECTORS-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned CONNECTORS-1518:
---
Assignee: Karl Wright
> MCF shutting down when Tika is u
[
https://issues.apache.org/jira/browse/CONNECTORS-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559073#comment-16559073
]
Karl Wright commented on CONNECTORS-1191:
-
Hi [~svanschalkwyk], is there any reason you
On Thu, Jul 26, 2018 at 11:09 AM msaunier wrote:
> On the repository connection. I have set the max document size to "20971520".
>
>
>
> Maxence
>
>
>
>
>
> *From:* Karl Wright [mailto:daddy...@gmail.com]
> *Sent:* Thursday, 26 July 2018 17:07
> *To:* us
How are you limiting content size? Is this in the repository connection,
or in an Allowed Documents transformation connection?
Karl
On Thu, Jul 26, 2018 at 10:58 AM msaunier wrote:
> I have set a limit of 20 MB per document, and I again get a Java out-of-memory error.
>
>
>
>
>
>
>
I believe there's also a content length tab in the Windows Share connector,
if you're using that.
Karl
On Thu, Jul 26, 2018 at 10:19 AM Karl Wright wrote:
> The ContentLimiter truncates documents. That's not what you want.
>
> Use the Allowed Documents transformer.
>
> Karl
>
>
> Maxence,
>
>
>
>
>
> *From:* Karl Wright [mailto:daddy...@gmail.com]
> *Sent:* Wednesday, 25 July 2018 19:15
> *To:* user@manifoldcf.apache.org
> *Subject:* ***UNCHECKED*** Re: Out of memory, one file bug i think
>
>
>
> It looks like you are still run
Hi Mario,
There is no connection between the number of CPUs and the number of output
connections. You pick the maximum number of output connections based on
the number of listening threads that you can use at the same time in Solr.
Karl
On Thu, Jul 26, 2018 at 9:22 AM Bisonti Mario
wrote:
>
[
https://issues.apache.org/jira/browse/CONNECTORS-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558285#comment-16558285
]
Karl Wright commented on CONNECTORS-1516:
-
Fix committed in Apache POI. But now we see
Karl Wright created TIKA-2693:
-
Summary: Tika 1.17 uses the wrong classloader for reflection
Key: TIKA-2693
URL: https://issues.apache.org/jira/browse/TIKA-2693
Project: Tika
Issue Type: Bug
like you may have
put the new poi jars in the wrong place? They should *all* be in
connector-common-lib too.
Karl
On Thu, Jul 26, 2018 at 6:23 AM Karl Wright wrote:
> Hi Maxence,
>
> The following error:
>
> >>>>>>
>
> FATAL 2018-07-26T11:30:32,220 (Wo
onal dependencies.
>
> J2KImageReader not loaded. JPEG2000 files will not be processed.
>
> See https://pdfbox.apache.org/2.0/dependencies.html#jai-image-io
>
> for optional dependencies.
>
>
>
> juil. 26, 2018 11:29:01 AM
> org.apache.tika.config.InitializableProblemHandler$3
>
ed and ingested like others do)?".
>
> Thanks.
>
>
>
> On Thu, 26 Jul 2018 at 0:35, Karl Wright ()
> wrote:
>
>> The crawled URL is transmitted as part of the RepositoryDocument object to
>> the output connector. If this is going to Solr, it's used as
/httpcomponents-client-ga/tutorial/html/statemgmt.html
Karl
On Thu, Jul 26, 2018 at 3:19 AM Karl Wright wrote:
> Ok, so the database for your site crawl contains both z.com and x.y.z.com
> cookies? And your site pages from domain a.y.z.com receive no cookies at
> all whe
it should since A.Y.Z is a sub-domain in
> Z).
>
> Only after making those changes by hand (replacing domain with sub-domain in
> the database) and restarting ManifoldCF does it begin to work.
>
> There might be security constraints somehow; I will consider further
> analysis.
>
> Regards.
>
>
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556937#comment-16556937
]
Karl Wright commented on CONNECTORS-1517:
-
[~jamesthomas], the connector was developed under
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1517:
Fix Version/s: ManifoldCF 2.11
> Documentum Connector uses different "uncon
[
https://issues.apache.org/jira/browse/CONNECTORS-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright reassigned CONNECTORS-1517:
---
Assignee: Karl Wright
> Documentum Connector uses different "uncon
The crawled URL is transmitted as part of the RepositoryDocument object to
the output connector. If this is going to Solr, it's used as the
document's ID. You can therefore customize Solr (or ElasticSearch) to
extract the data you need at the indexing end.
If this doesn't make any sense to you,
; X.Y.Z.com", none of the sub-sites receives that cookie. I need to write the
> same cookie for every sub-domain; that solves the situation (and
> thankfully it is a language cookie and not a dynamic one).
>
> Regards.
>
> On Wed, 25 Jul 2018 at 19:17, Karl Wright ()
> wrote
sed » documents are deleted very fast
>
> « Active » documents too.
>
> But for « Documents » in the interface, deleting every line is very slow.
> ManifoldCF deletes documents 100 at a time.
>
>
>
> Maxence,
>
>
>
>
>
>
>
> *From:* Karl Wright [mailto:
I'm sorry, I don't understand your question.
Karl
On Wed, Jul 25, 2018 at 12:53 PM msaunier wrote:
> Hi Karl,
>
>
>
> Can I configure ManifoldCF to clean up faster? I think ManifoldCF
> cleans up 100 documents at a time by default.
>
>
>
> Maxence,
>
>
>
en the documentation for an example
>> of that.
>>
>> Regards!
>>
>> On Thu, 19 Jul 2018 at 21:54, Karl Wright ()
>> wrote:
>>
>>> You are correct that cookies are not shared among threads. That is by
>>> design.
>>>
>
It looks like you are still running out of memory. I would love to know
what document it was that is doing that. I suspect it is very large already,
and for some reason it cannot be streamed.
Karl
On Wed, Jul 25, 2018 at 1:13 PM Karl Wright wrote:
> Hi Maxence,
>
> The second
.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
> ~[mcf-pull-agent.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939)
> ~[?:?]
>
> at
> org.apache.manifoldcf.crawler.sys
> at
> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
> ~[mcf-pull-agent.jar:?]
>
> at
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939)