Re: Manifold Job process issue

2021-11-09 Thread Karl Wright
If your docker image's clock is out of sync badly with the real world, then System.currentTimeMillis() may give bogus values, and ManifoldCF uses that to manage throttling etc. I don't know if that is the correct explanation but it's the only thing I can think of. Karl On Tue, Nov 9, 2021 at
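To illustrate why a skewed or jumping wall clock can upset time-based throttling, here is a minimal sketch. It is not ManifoldCF's actual throttling code; the class IntervalThrottle and its methods are hypothetical, and the time source is passed in so that the effect of a clock jump can be reproduced.

```java
import java.util.function.LongSupplier;

/** Hypothetical interval throttle for illustration; NOT ManifoldCF's implementation. */
class IntervalThrottle {
    private final long minIntervalMs;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private long lastFetchMs;
    private boolean hasFetched = false;

    IntervalThrottle(long minIntervalMs, LongSupplier clock) {
        this.minIntervalMs = minIntervalMs;
        this.clock = clock;
    }

    /** Milliseconds the caller should still wait before the next fetch. */
    long waitTimeMs() {
        if (!hasFetched) {
            return 0L;
        }
        // If the clock jumped backwards, elapsed is negative and the wait grows.
        long elapsed = clock.getAsLong() - lastFetchMs;
        return Math.max(0L, minIntervalMs - elapsed);
    }

    /** Remember when the most recent fetch happened, according to the clock. */
    void recordFetch() {
        lastFetchMs = clock.getAsLong();
        hasFetched = true;
    }
}
```

With a 500 ms interval, a clock that jumps back 10 seconds right after a fetch makes this sketch compute a wait of 10.5 seconds, while a forward jump makes it believe the interval has already passed. Whether that matches what happens inside the container in question is an assumption.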

Re: Duplicate key error

2021-10-27 Thread Karl Wright
the same problem. If the problem IS repeatable, we will of course look deeper into what is going on. Karl On Wed, Oct 27, 2021 at 9:52 AM Karl Wright wrote: > Is it repeatable? My guess is it is not repeatable. > Karl > > On Wed, Oct 27, 2021 at 4:43 AM ritika jain > wrote:

Re: Duplicate key error

2021-10-27 Thread Karl Wright
Is it repeatable? My guess is it is not repeatable. Karl On Wed, Oct 27, 2021 at 4:43 AM ritika jain wrote: > So , it can be left as it is.. ? because it is preventing job to complete > and its stopping. > > On Tue, Oct 26, 2021 at 8:40 PM Karl Wright wrote: > >> That's

Re:

2021-10-26 Thread Karl Wright
That's a database bug. All of our underlying databases have some bugs of this kind. Karl On Tue, Oct 26, 2021 at 9:17 AM ritika jain wrote: > Hi All, > > While using Manifoldcf 2.14 with Web connector and ES connector. After a > certain time of continuing the job (jobs ingest some documents

Re: Windows Shares job-Limit on defining no of paths

2021-10-25 Thread Karl Wright
The only limit is that the more you add, the slower it gets. Karl On Mon, Oct 25, 2021 at 6:06 AM ritika jain wrote: > Hi , > Is there any limit on the number of paths we can define in job using > Repository as Window Shares and ES as Output > > Thanks >

Re: Null Pointer Exception

2021-10-25 Thread Karl Wright
The API should really catch this situation. Basically, you are calling a function that requires an input but you are not providing one. In that case the API sets the input to "null", and the detailed operation is called. The detailed operation is not expecting a null input. This is API piece
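The situation described above can be sketched as follows. This is only an illustration under assumed names: ApiInputCheck, apiCall and detailedOperation are hypothetical and do not correspond to actual ManifoldCF API classes. It shows an API layer rejecting a missing input with a clear error instead of passing null down to an operation that does not expect it.

```java
import java.util.Map;

/** Hypothetical sketch; names do not correspond to real ManifoldCF classes. */
class ApiInputCheck {

    /** Stand-in for the "detailed operation", which assumes a non-null input. */
    static int detailedOperation(Map<String, String> input) {
        return input.size();
    }

    /** Stand-in for the API entry point: validate first, then delegate. */
    static int apiCall(Map<String, String> input) {
        if (input == null) {
            // Fail early with a message the caller can act on,
            // rather than a NullPointerException deeper in the call chain.
            throw new IllegalArgumentException(
                "This operation requires an input, but none was provided");
        }
        return detailedOperation(input);
    }
}
```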

[jira] [Resolved] (CONNECTORS-1675) Unable to delete Mapping Connections via JSON API

2021-10-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1675. - Fix Version/s: ManifoldCF 2.21 Resolution: Fixed r1894400 > Una

[jira] [Assigned] (CONNECTORS-1675) Unable to delete Mapping Connections via JSON API

2021-10-19 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1675: --- Assignee: Kishore Kumar > Unable to delete Mapping Connections via JSON

[jira] [Assigned] (CONNECTORS-1675) Unable to delete Mapping Connections via JSON API

2021-10-19 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1675: --- Assignee: Karl Wright (was: Kishore Kumar) > Unable to delete Mapp

[jira] [Updated] (CONNECTORS-1674) KEYS file must be called KEYS

2021-10-05 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1674: Fix Version/s: ManifoldCF next > KEYS file must be called K

[jira] [Assigned] (CONNECTORS-1674) KEYS file must be called KEYS

2021-10-05 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1674: --- Assignee: Karl Wright > KEYS file must be called K

[jira] [Updated] (CONNECTORS-1660) Patch for MCF HTML extractor connector

2021-10-04 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1660: Fix Version/s: (was: ManifoldCF 2.18) ManifoldCF next > Pa

[jira] [Resolved] (CONNECTORS-1673) Download page must use https for sigs and hashes

2021-10-04 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1673. - Fix Version/s: ManifoldCF 2.20 Resolution: Fixed 2.20 release update should

[jira] [Assigned] (CONNECTORS-1673) Download page must use https for sigs and hashes

2021-10-04 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1673: --- Assignee: Karl Wright > Download page must use https for sigs and has

[RESULT] [VOTE] Release Apache ManifoldCF 2.20, RC0

2021-10-01 Thread Karl Wright
Three +1's, >72 hours. Vote passes! Karl On Fri, Oct 1, 2021 at 10:00 AM Karl Wright wrote: > Ran tests, verified things still work. +1 from me. > > Karl > > > On Mon, Sep 27, 2021 at 12:46 PM Cihad Guzel wrote: > >> +1 >> >> Cihad Güzel >>

Re: [VOTE] Release Apache ManifoldCF 2.20, RC0

2021-10-01 Thread Karl Wright
Ran tests, verified things still work. +1 from me. Karl On Mon, Sep 27, 2021 at 12:46 PM Cihad Guzel wrote: > +1 > > Cihad Güzel > > > On Sun, Sep 26, 2021 at > 15:06, the user with address wrote: > > > +1 > > > > Julien > > >

Re: Error: Repeated service interruptions - failure processing document: Read timed out

2021-09-30 Thread Karl Wright
Hi, You say this is a "Tika error". Is this Tika as a stand-alone service? I do not recognize any ManifoldCF classes whatsoever in this thread dump. If this is Tika, I suggest contacting the Tika team. Karl On Thu, Sep 30, 2021 at 3:02 AM Bisonti Mario wrote: > Additional info. > > > > I

[VOTE] Release Apache ManifoldCF 2.20, RC0

2021-09-25 Thread Karl Wright
Please vote on whether to release Apache ManifoldCF 2.20, RC0. This release has a new connector in it, and a few bug fixes, but is otherwise pretty light. Nevertheless, it's a month behind schedule so I'm calling a vote for release, by the end of the month. The release artifact can be found at:

[jira] [Updated] (CONNECTORS-1671) Solr output connector behavior on some exceptions

2021-09-08 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1671: Fix Version/s: (was: ManifoldCF 2.19) ManifoldCF 2.20 > S

[jira] [Assigned] (CONNECTORS-1671) Solr output connector behavior on some exceptions

2021-09-08 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1671: --- Fix Version/s: (was: ManifoldCF next) ManifoldCF 2.19

Re: Tika Parser Issue

2021-09-07 Thread Karl Wright
This is something you should contact the Tika project about. Karl On Tue, Sep 7, 2021 at 8:46 AM ritika jain wrote: > Hi All, > > I am using tika-core 1.21 and tika-parsers 1.21 jar files as tika > dependencies in Manifoldcf 2.14 version. > Getting some issues while parsing *PDF *files. Some

Re: Query:JCIFS connector

2021-08-23 Thread Karl Wright
I have a work day today, with limited time. The UI is what it is; it does not have capabilities beyond what is stated in the UI and in the manual. It's meant to allow construction of paths piece by piece, not by full subdirectory at a time. You can obviously use the API if you want to construct

Re: Job Deletion query

2021-08-12 Thread Karl Wright
Yes, when you delete a job, the indexed documents associated with that job are removed from the index. ManifoldCF is a synchronizer, not a crawler, so when you remove the synchronization job then if it didn't delete the indexed documents they would be left dangling. Karl On Thu, Aug 12, 2021

Re: Window shares dynamic Job issue

2021-08-11 Thread Karl Wright
"_attribute_indexable":"yes","_attribute_filespec":"\/*.docb","_value_":"","_attribute_type":"file"},{"_attribute_indexable":"yes","_attribute_filespec":"\/*.dot","_value_"

Re: Window shares dynamic Job issue

2021-08-10 Thread Karl Wright
I am sorry, but I'm having trouble understanding how exactly you are configuring the JCIFS connector in these two cases. Can you view the job in each case and provide cut-and-paste of the view? Karl On Tue, Aug 10, 2021 at 9:09 AM ritika jain wrote: > Hi All, > > I am using Window shares

[jira] [Commented] (CONNECTORS-1671) Solr output connector behavior on some exceptions

2021-07-31 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390936#comment-17390936 ] Karl Wright commented on CONNECTORS-1671: - [~julienFL], you are catching RuntimeException

Re: JCIFS Connector File Size Attribute

2021-07-26 Thread Karl Wright
> "originalSize". > > Kind Regards, > Uwe > > > > > -Original Message- > From: Karl Wright > Sent: Friday, July 23, 2021 20:34 > To: dev > Subject: Re: JCIFS Connector File Size Attribute > > Hi, > The original size

Re: JCIFS Connector File Size Attribute

2021-07-23 Thread Karl Wright
Hi, The original size field is provided by the Repository Connector, and passed to the output connector. In this case, the code that sets the field is here: kawright@1USDKAWRIGHT:/mnt/c/wip/mcf/trunk$ grep -R "rd.setOriginalSize(originalLength);" . --include "*.java"

Re: Solr output connector - behavior on some exceptions

2021-07-13 Thread Karl Wright
It is called the "Lucene/Solr Connector" component. Karl On Tue, Jul 13, 2021 at 10:11 AM wrote: > Ok, there is no "Solr connector" component in JIRA, can you add it please > ? > > -----Original Message- > From: Karl Wright > Sent: Tuesday, 13 Jul

Re: Solr output connector - behavior on some exceptions

2021-07-13 Thread Karl Wright
had a > crawling job stopped because of an exception concerning a document having a > null value for a specific metadata and another one with a value that > triggered a request parsing issue on Solr side. > > Julien > > -Original Message- > From: Karl Wright > Env

Re: Solr output connector - behavior on some exceptions

2021-07-13 Thread Karl Wright
If the "solr is down" exceptions are indeed caught upstream, I'm tentatively in agreement that this fallback logic can be changed. But I would like to understand what specifically you are seeing this happen for. What cases are you hoping to improve? Karl On Tue, Jul 13, 2021 at 9:39 AM wrote:

Re: Is the Web connector supporting zipped sitemap.xml.gz referenced by robots.txt?

2021-07-07 Thread Karl Wright
+; > return new > > UrlsetContextClass(theStream,namespace,localName,qName,atts,documentURI,handler); >} > > So, my question is: is there another way to handle sitemaps inside the > Web Crawler? > > Cheers Sebastian > > > > > > Am 07

Re: Is the Web connector supporting zipped sitemap.xml.gz referenced by robots.txt?

2021-07-07 Thread Karl Wright
The robots parsing does not recognize the "sitemaps" line, which was likely not in the spec for robots when this connector was written. Karl On Wed, Jul 7, 2021 at 3:31 AM h0444xk8 wrote: > Hi, > > I have a general question. Is the Web connector supporting sitemap files > referenced by the

[jira] [Commented] (CONNECTORS-1670) PostgreSQL: transaction in progress

2021-07-06 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17375979#comment-17375979 ] Karl Wright commented on CONNECTORS-1670: - It's not clear that this is anything other than

Re: Commit on CONNECTORS-1667 branch

2021-06-25 Thread Karl Wright
Thanks! No, I haven't had time to integrate this, but if the branch is ready, I'd be happy to pull it in now. Please let me know. Karl On Fri, Jun 25, 2021 at 9:52 AM wrote: > Hi Karl, > > > > I needed to patch my contribution for a new Tika connector from 3 months > ago: I made the

[jira] [Resolved] (LUCENE-10012) Cache concurrency for GeoStandardPath is poorly designed

2021-06-22 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved LUCENE-10012. -- Fix Version/s: main (9.0) Resolution: Fixed > Cache concurrency for GeoStandardP

[jira] [Created] (LUCENE-10012) Cache concurrency for GeoStandardPath is poorly designed

2021-06-22 Thread Karl Wright (Jira)
Karl Wright created LUCENE-10012: Summary: Cache concurrency for GeoStandardPath is poorly designed Key: LUCENE-10012 URL: https://issues.apache.org/jira/browse/LUCENE-10012 Project: Lucene - Core

Re: Manifoldcf Redirection process

2021-05-28 Thread Karl Wright
302 does get recognized as a redirection, yes On Fri, May 28, 2021 at 5:07 AM ritika jain wrote: > Is the process the same when fetch/process status code returned is 302 ? When running a job with web crawler and ES output connector >>> > can anybody have a clue about this >

[jira] [Commented] (CONNECTORS-1668) Use of Wild Characters in SharePoint Connector.

2021-05-23 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17350025#comment-17350025 ] Karl Wright commented on CONNECTORS-1668: - If you think you have a web service call

[jira] [Commented] (CONNECTORS-1668) Use of Wild Characters in SharePoint Connector.

2021-05-22 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349817#comment-17349817 ] Karl Wright commented on CONNECTORS-1668: - About whether we can implement "site disc

[jira] [Commented] (CONNECTORS-1668) Use of Wild Characters in SharePoint Connector.

2021-05-22 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349815#comment-17349815 ] Karl Wright commented on CONNECTORS-1668: - The logic for path rules is as follows: {code

[jira] [Assigned] (CONNECTORS-1668) Use of Wild Characters in SharePoint Connector.

2021-05-22 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1668: --- Assignee: Karl Wright > Use of Wild Characters in SharePoint Connec

[jira] [Commented] (CONNECTORS-1668) Use of Wild Characters in SharePoint Connector.

2021-05-22 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17349807#comment-17349807 ] Karl Wright commented on CONNECTORS-1668: - Could you view your job, and include a screen

Re: Manifoldcf Redirection process

2021-05-19 Thread Karl Wright
ManifoldCF reads all the URLs on its queue. If it's a 301, it detects this and pushes the new URL onto the document queue. When it gets to the new URL, it processes it like any other. Karl On Wed, May 19, 2021 at 8:32 AM ritika jain wrote: > Hi > > I want to understand the process of "How
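The queue-based handling described above can be sketched roughly like this. It is a simplified illustration, not the Web connector's code: RedirectQueueSketch is a hypothetical class, and the map of redirects stands in for real HTTP responses (a URL present in the map is assumed to answer with a redirect to the mapped target; any other URL is assumed to be an ordinary document).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of queue-driven redirect handling; not ManifoldCF code. */
class RedirectQueueSketch {

    /**
     * @param seed      first URL placed on the queue
     * @param redirects simulated responses: URL -> redirect target
     * @return URLs that were processed as ordinary documents, in order
     */
    static List<String> crawl(String seed, Map<String, String> redirects) {
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> processed = new ArrayList<>();

        queue.add(seed);
        seen.add(seed);
        while (!queue.isEmpty()) {
            String url = queue.poll();
            String target = redirects.get(url);
            if (target != null) {
                // Redirect detected: do not index this URL, queue the new one instead.
                // The "seen" set keeps redirect loops from running forever.
                if (seen.add(target)) {
                    queue.add(target);
                }
            } else {
                // Ordinary document: process it like any other.
                processed.add(url);
            }
        }
        return processed;
    }
}
```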

Re: Interrupted while acquiring credits

2021-05-14 Thread Karl Wright
e job or crashing manifold > > On Fri, May 14, 2021 at 1:34 PM Karl Wright wrote: > >> ' >> >> *JCIFS: Possibly transient exception detected on attempt 1 while getting >> share security' Yes, it is going to retry.* >> >> *Karl* >> >>

Re: Interrupted while acquiring credits

2021-05-14 Thread Karl Wright
' *JCIFS: Possibly transient exception detected on attempt 1 while getting share security' Yes, it is going to retry.* *Karl* On Fri, May 14, 2021 at 1:45 AM ritika jain wrote: > Hi, > I am using Windows shares connector in manifoldcf 2.14 and ElasticSearch > connector as Output connector and

Re: CONNECTORS-1667 integration to trunk ?

2021-05-12 Thread Karl Wright
Hi Julien, I was occupied with several work-related escalations and trying to get 2.19 out the door. I will have time this weekend to review the new connector but for right now can you hold off? Thanks! Karl On Wed, May 12, 2021 at 10:56 AM wrote: > Hi Karl, > > > > Is that ticket OK for

Re: Notification connector error

2021-05-11 Thread Karl Wright
This used to work fine, but I suspect that when SSL was declared unsafe, it was disabled, and now only TLS will work. Karl On Tue, May 11, 2021 at 12:13 PM wrote: > Hello, > > > > I am trying to use an email notification connector but without success. > When the connector tries to send an

[RESULT] [VOTE] Release Apache ManifoldCF 2.19, RC1

2021-05-10 Thread Karl Wright
s and all is OK. > > But the maven tests fail on the alfresco webscript tests and it is > impossible to skip them... > > Still, it is a +1 > > Julien > > -----Original Message- > From: Karl Wright > Sent: Thursday, May 6, 2021 12:37 > To: dev > Subject: Re: [V

Re: [VOTE] Release Apache ManifoldCF 2.19, RC1

2021-05-06 Thread Karl Wright
Ran tests, including Alfresco webscript one. +1 from me. Karl On Thu, May 6, 2021 at 5:28 AM Karl Wright wrote: > Please vote to release Apache ManifoldCF 2.19, RC1. The release artifact > can be found at: > https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifo

[VOTE] Release Apache ManifoldCF 2.19, RC1

2021-05-06 Thread Karl Wright
Please vote to release Apache ManifoldCF 2.19, RC1. The release artifact can be found at: https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.19 . There is also a release tag at https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.19-RC1 . This release has a significant

Re: [VOTE] Release Apache ManifoldCF 2.19, RC0

2021-05-05 Thread Karl Wright
I have a patch for this and will spin a new RC. Karl On Wed, May 5, 2021 at 1:43 PM Karl Wright wrote: > After 1 1/2 hours spent downloading, I see the issue: > > >>>>>> > [junit] java.lang.IllegalStateException: N

Re: [VOTE] Release Apache ManifoldCF 2.19, RC0

2021-05-05 Thread Karl Wright
va:864) [junit] at org.eclipse.jetty.security.HashLoginService.setUserStore(HashLoginService.java:126) <<<<<< This is likely due to a recent Jetty upgrade - I think it was last release cycle. It sounds like we need to provide a configuration path now for Jetty to come up. Karl On Wed, May 5

Re: [VOTE] Release Apache ManifoldCF 2.19, RC0

2021-05-05 Thread Karl Wright
I'm having severe bandwidth issues with my internet connection today. I will have to wait to try this completely until that clears up. Karl On Wed, May 5, 2021 at 8:25 AM Karl Wright wrote: > I just run "ant test". The alfresco test doesn't run, though, unless you > run "

Re: [VOTE] Release Apache ManifoldCF 2.19, RC0

2021-05-05 Thread Karl Wright
r testing everything? > Maybe I'm missing something. > Cheers, > PJ > > On Thu, Apr 29, 2021 at 01:21 Karl Wright > wrote: > > > Please vote to release Apache ManifoldCF 2.19, RC0. The release artifact > > can be found at: > > https://dist.a

Re: [VOTE] Release Apache ManifoldCF 2.19, RC0

2021-05-03 Thread Karl Wright
Reminder: Voting underway. Please evaluate and vote. Ran tests. +1 from me. Karl On Wed, Apr 28, 2021 at 7:21 PM Karl Wright wrote: > Please vote to release Apache ManifoldCF 2.19, RC0. The release artifact > can be found at: > https://dist.apache.org/repos/dist/dev/manifold

[VOTE] Release Apache ManifoldCF 2.19, RC0

2021-04-28 Thread Karl Wright
Please vote to release Apache ManifoldCF 2.19, RC0. The release artifact can be found at: https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.19 . There is also a release tag at https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.19-RC0 . This release has a significant

I've created the 2.19 release branch, and will spin an RC0 this evening, with luck

2021-04-28 Thread Karl Wright
Voting will commence when the candidate is ready. Please be ready to evaluate it. Thanks, Karl

It's release time again

2021-04-12 Thread Karl Wright
I'd like to build RC0 by the end of the week, so if there are any pressing issues I don't know about, this would be the time to address them. The one thing I know that is still outstanding is the elastic search connector patch that addresses a changed field name. I have been quite busy and have

Re: General questions

2021-04-12 Thread Karl Wright
Hi, There was a book written but never published on ManifoldCF and how to write connectors. It's meant to be extended in that way. The PDFs for the book are available for free online, and they are linked through the manifoldcf web site. Karl On Mon, Apr 12, 2021 at 8:49 AM koch wrote: > Hi

Re: Manifoldcf Deletion Process

2021-03-30 Thread Karl Wright
Hi Ritika, There is no deletion process. Deletion takes place when a job is run in a mode where deletion is possible (there are some where it is not). The way it takes place depends on the kind of repository connector (what model it declares itself to use). For the most common kinds of

Re: How to override carry down data

2021-03-21 Thread Karl Wright
connector calling the IProcessActivity methods meant to signal that document processing has finished? If not, that is the problem! Karl On Sun, Mar 21, 2021 at 9:14 PM Karl Wright wrote: > Ah, so it appears that the way this works is subtle and clever. > > Values are added or updated in

Re: How to override carry down data

2021-03-21 Thread Karl Wright
ap,sb.toString(),newList,null); noteModifications(0,list.size(),0); So the question becomes: does it get called appropriately? Karl On Sun, Mar 21, 2021 at 8:45 PM Karl Wright wrote: > I've tried to refresh my memory by looking at the carrydown code, which is > quite old at

Re: How to override carry down data

2021-03-21 Thread Karl Wright
retrieves a contentArray containing 2 > values, the old one "someContent", and the new one "newContent" > > I can guarantee that the parentIdentifier between the two crawls is the > same and that on the second crawl, only the "newContent" is added, I > debugged

Re: How to override carry down data

2021-03-21 Thread Karl Wright
Can you give me a code example? The carry-down information is set by the parent, as you say. The specific information is keyed to the parent so when the child is added to the queue, all old carrydown information from the same parent is deleted at that time, and until that happens the carrydown

Re: Another Elasticsearch patch to allow the long URI

2021-03-20 Thread Karl Wright
to be sure that the connector works with most versions of ElasticSearch? Please help clarify so that I can finish this off. The changes are committed to trunk; I would be very appreciative if Shirai Takashi/ 白井隆 reviewed them there. Thanks! Karl On Sat, Mar 20, 2021 at 4:32 AM Karl Wright wrot

[jira] [Commented] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305410#comment-17305410 ] Karl Wright commented on CONNECTORS-1666: - r1887848 updates Japanese translations included

[jira] [Commented] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305408#comment-17305408 ] Karl Wright commented on CONNECTORS-1666: - r1887847 adds support for ingestattachment

Re: Another Elasticsearch patch to allow the long URI

2021-03-20 Thread Karl Wright
. There are more changes in these patches than just the ID length issue. I am working to add this functionality as well but without anything I would consider to be unneeded. Karl On Fri, Mar 19, 2021 at 3:48 AM Karl Wright wrote: > Thanks for the information. I'll see what I can do. >

[jira] [Commented] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305365#comment-17305365 ] Karl Wright commented on CONNECTORS-1666: - r1887840 submits my take on what is necessary

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Description: The size of the ElasticSearch ID field is severely limited. We

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-elastic-id-2.patch > ElasticSearch connector cannot

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-elastic-id.patch > ElasticSearch connector cannot

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-2.18-elastic-id.patch > ElasticSearch connector can

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: (was: apache-manifoldcf-2.18-elastic-id.patch) > ElasticSea

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-2.18-elastic-id.patch > ElasticSearch connector can

[jira] [Commented] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17305344#comment-17305344 ] Karl Wright commented on CONNECTORS-1666: - {quote} Hi, there. I've found another trouble

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-elastic-id.patch.gz > ElasticSearch connector can

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-2.18-elastic-id.patch.gz > ElasticSearch connec

[jira] [Updated] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1666: Attachment: apache-manifoldcf-elastic-id-2.patch.gz > ElasticSearch connector can

[jira] [Created] (CONNECTORS-1666) ElasticSearch connector cannot use full URLs for IDs

2021-03-20 Thread Karl Wright (Jira)
Karl Wright created CONNECTORS-1666: --- Summary: ElasticSearch connector cannot use full URLs for IDs Key: CONNECTORS-1666 URL: https://issues.apache.org/jira/browse/CONNECTORS-1666 Project

Re: Another Elasticsearch patch to allow the long URI

2021-03-19 Thread Karl Wright
Thanks for the information. I'll see what I can do. Karl On Thu, Mar 18, 2021 at 7:23 PM Shirai Takashi/ 白井隆 wrote: > Hi, Karl. > > Karl Wright wrote: > >Hi - I'm still waiting for this patch to be attached to a ticket. That is > >the only way I believe we're allowed

Re: Another Elasticsearch patch to allow the long URI

2021-03-18 Thread Karl Wright
Hi - I'm still waiting for this patch to be attached to a ticket. That is the only way I believe we're allowed to accept it legally. Karl On Thu, Mar 4, 2021 at 7:16 PM Shirai Takashi/ 白井隆 wrote: > Hi, Karl. > > Karl Wright wrote: > >I agree it is unlikely that the JDK wi

Re: Inactive MCF agent

2021-03-16 Thread Karl Wright
level in MCF ? > > Regards, > Julien > > > -Original Message- > From: Karl Wright > Sent: Tuesday, March 2, 2021 19:17 > To: dev > Subject: Re: Inactive MCF agent > > The MCF Agents process shouldn't get hung up under normal operation. If > it encounte

Re: Add activity records to the web connector

2021-03-09 Thread Karl Wright
Yes, please go ahead Karl On Tue, Mar 9, 2021 at 11:06 AM wrote: > Hi Karl, > > > > I would like to add more activity records in the web connector to keep > track > in the simple history of filtered URLs that would match exclude filters. > May > I create a ticket for this and propose a patch ?

Re: Another Elasticsearch patch to allow the long URI

2021-03-04 Thread Karl Wright
I agree it is unlikely that the JDK will lose support for SHA-1 because it is used commonly, as is MD5. So please feel free to use it. Karl On Wed, Mar 3, 2021 at 7:54 PM Shirai Takashi/ 白井隆 wrote: > Hi, Horn. > > Jörn Franke wrote: > >Makes sense > > I don't think that it's easy. > > > >>>

Re: Inactive MCF agent

2021-03-02 Thread Karl Wright
The MCF Agents process shouldn't get hung up under normal operation. If it encounters a problem that may call its continued activity into question, it shuts itself down. There are two situations where the process could theoretically hang. The first is when you are using file-based synch, and

Re: Another Elasticsearch patch to allow the long URI

2021-03-02 Thread Karl Wright
Hi - this is very helpful. I would like you to officially create a ticket in Jira: https://issues.apache.org/jira , project "CONNECTORS", and attach these patches. Backwards compatibility means that we very likely have to use the hash approach, and not use the decoding approach. Thanks, Karl
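As a rough illustration of what a "hash approach" could look like, the sketch below leaves short IDs untouched and replaces over-long ones with a SHA-1 hex digest (SHA-1 is the digest discussed elsewhere in this thread). This is not the patch attached to the ticket: the class LongIdHasher, the method toDocumentId and the 512-byte limit are assumptions made for the example, and keeping short IDs unchanged is only one possible reading of "backwards compatibility".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of hashing over-long document IDs; not the actual patch. */
class LongIdHasher {
    // Assumed maximum ID size in bytes, chosen for illustration only.
    static final int MAX_ID_BYTES = 512;

    static String toDocumentId(String uri) {
        byte[] raw = uri.getBytes(StandardCharsets.UTF_8);
        if (raw.length <= MAX_ID_BYTES) {
            // Short IDs stay as they are, so previously indexed documents keep their IDs.
            return uri;
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(raw);
            // Render the 20-byte digest as 40 lowercase hex characters.
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }
}
```

Unlike a decoding approach, a digest cannot be turned back into the original URL, so in a design like this the full URL would have to be kept in a separate field if it is needed later.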

Re: Congratulations to the new Lucene PMC Chair, Michael Sokolov!

2021-02-20 Thread Karl Wright
Congratulations! On Sat, Feb 20, 2021 at 4:17 PM Namgyu Kim wrote: > Congratulations, Mike! :D > > On Thu, Feb 18, 2021 at 6:32 AM Anshum Gupta > wrote: > >> Every year, the Lucene PMC rotates the Lucene PMC chair and Apache Vice >> President position. >> >> This year we nominated and elected

Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-20 Thread Karl Wright
Congratulations! Karl On Sat, Feb 20, 2021 at 6:28 AM Uwe Schindler wrote: > Congrats Jan! > > > > Uwe > > > > - > > Uwe Schindler > > Achterdiek 19, D-28357 Bremen > > https://www.thetaphi.de > > eMail: u...@thetaphi.de > > > > *From:* Anshum Gupta > *Sent:* Thursday, February 18, 2021

Re: Multiprocess file installation of manifold

2021-02-17 Thread Karl Wright
File synchronization is still supported but is deprecated. We recommend zookeeper synchronization unless you have a very good reason not to. Karl On Wed, Feb 17, 2021 at 12:26 PM Ananth Peddinti wrote: > Hello Team , > > > I would like to know if someone has already done multi-process model

Re: Job Content Length issue

2021-02-17 Thread Karl Wright
ue, Feb 16, 2021 at 7:29 PM Karl Wright wrote: > >> Hi, do you mean content limiter length of 100? >> >> I assume you are using the internal Tika transformer? Are you combining >> this with a Solr output connection that is not using the extract handler? >> >

Re: Job Content Length issue

2021-02-16 Thread Karl Wright
Hi, do you mean content limiter length of 100? I assume you are using the internal Tika transformer? Are you combining this with a Solr output connection that is not using the extract handler? By "manifold crashes" I assume you actually mean it runs out of memory. The "long running query"

Re: content length tab

2021-02-15 Thread Karl Wright
This parameter is in bytes. Karl On Mon, Feb 15, 2021 at 9:03 AM ritika jain wrote: > Hi Users, > > Can anybody tell me if this can be filled as bytes or kilobytes here. > > The "Content Length tab looks like this: > > > [image: Windows Share Job, Content Length tab] > > Values are to be

[jira] [Commented] (CONNECTORS-1656) HTML extractor produces invalid XML

2021-02-12 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17283929#comment-17283929 ] Karl Wright commented on CONNECTORS-1656: - The patch is fine. I was not notified

Re: GSOC - Mavenisation of MCF ?

2021-02-10 Thread Karl Wright
ith Karl, > > for GSoC purposes it's better to propose something related to an > > independent and well defined connector. > > PJ > > > > On Mon, Feb 8, 2021 at 12:57 Karl Wright > > wrote: > > > > > There are alread

Re: GSOC - Mavenisation of MCF ?

2021-02-08 Thread Karl Wright
There are already poms throughout. However, release distribution structure cannot be built with Maven at this time because the directory structure of the final release artifact is complex, and because we want individual external connector developers to be able to "add into" this. In other words,

Re: JIRA Authority connector - Remove potential domain in username

2021-01-29 Thread Karl Wright
What you need is a user mapping. See: http://manifoldcf.apache.org/release/release-2.18/en_US/end-user-documentation.html#mappers On Fri, Jan 29, 2021 at 7:16 AM wrote: > Hi, > > > > In my use cases, as I often combine several authorities with the Active > Directory authority, I request the

[jira] [Assigned] (CONNECTORS-1662) JIRA connector - NullPointerException after getCharSet method

2021-01-29 Thread Karl Wright (Jira)
[ https://issues.apache.org/jira/browse/CONNECTORS-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1662: --- Assignee: Karl Wright > JIRA connector - NullPointerException after getChar
