Re: Sharepoint Crawl - Missing documents

2019-03-06 Thread Karl Wright
> > [\r][\n]" > > Thanks, > > Gaurav > > On Wed, Mar 6, 2019 at 4:44 PM Karl W

Re: Sharepoint Crawl - Missing documents

2019-03-06 Thread Karl Wright
> On Wed, Mar 6, 2019 at 12:33 PM Karl Wright wrote: > >> Hi Gaurav, >> Which version of SharePoint is this? And, did you install the SharePoint >> plugin for ManifoldCF, and select the correct versions of SharePoint in the >> connection configuration? >>

Re: Sharepoint Crawl - Missing documents

2019-03-05 Thread Karl Wright
e.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227) > ~[?:?] > at org.apache.axis.SOAPPart.getAsSOAPEnvelope(SOAPPart.java:696) > ~[?:?] > ... 12 more > WARN 2019-03-05T23:50:15,602 (Worker thread '6') - Service interruption > reported for jo

Re: Log Framework Zoo

2019-03-05 Thread Karl Wright
Two years ago I moved the standard logger for all of ManifoldCF to log4j 2. This was a non-backwards-compatible change but it was forced on us because our downstream connector dependencies started to require it. However, there are still log4j 1.2 dependencies we cannot get rid of due to other

Re: Sharepoint Crawl - Missing documents

2019-03-04 Thread Karl Wright
Hi Gaurav, There is no document count threshold value. If you can identify libraries or subsites that aren't being crawled, you can turn on connector debugging to see why the connector is skipping them. There could be many reasons for a library or site to be skipped, e.g. bad specification rules,

Re: Welcome Ignacio Vera to the PMC

2019-03-04 Thread Karl Wright
Welcome, Ignacio! Karl On Mon, Mar 4, 2019 at 4:51 AM Alan Woodward wrote: > Congratulations and welcome, Ignacio! > > > On 4 Mar 2019, at 09:09, Adrien Grand wrote: > > > > I am pleased to announce that Ignacio Vera has accepted the PMC's > > invitation to join. > > > > Welcome Ignacio! > > >

[jira] [Assigned] (CONNECTORS-1590) Resources should be closed in a finally block

2019-03-03 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1590: --- Assignee: Karl Wright > Resources should be closed in a finally bl

[jira] [Resolved] (CONNECTORS-1590) Resources should be closed in a finally block

2019-03-03 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1590. - Resolution: Won't Fix > Resources should be closed in a finally bl

[jira] [Commented] (CONNECTORS-1590) Resources should be closed in a finally block

2019-03-03 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782728#comment-16782728 ] Karl Wright commented on CONNECTORS-1590: - This particular invocation is only ever invoked

[jira] [Resolved] (CONNECTORS-1589) lrusize always null

2019-03-03 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1589. - Resolution: Fixed r1854702 > lrusize always n

[jira] [Commented] (CONNECTORS-1589) lrusize always null

2019-03-03 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782724#comment-16782724 ] Karl Wright commented on CONNECTORS-1589: - This requires an infrastructure change

[jira] [Assigned] (CONNECTORS-1589) lrusize always null

2019-03-03 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1589: --- Assignee: Karl Wright > lrusize always n

[jira] [Assigned] (CONNECTORS-1588) Custom Jcifs Properties

2019-02-28 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1588: --- Assignee: Karl Wright > Custom Jcifs Propert

[jira] [Commented] (CONNECTORS-1588) Custom Jcifs Properties

2019-02-28 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780473#comment-16780473 ] Karl Wright commented on CONNECTORS-1588: - Patch looks fine. I'll commit it. > Cus

[jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes

2019-02-28 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780373#comment-16780373 ] Karl Wright commented on CONNECTORS-1563: - Hi [~Subasini], The "excluded mime

Re: Difference between Maximum document length and Max file size

2019-02-27 Thread Karl Wright
Hi Cihad, For "Maximum document length", you are talking about the Solr connector, correct? In that case it is the maximum size of extracted content that will be sent to Solr. (The connector assumes that when you aren't using the /update/extract handler you are extracting the content upstream

[jira] [Resolved] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-26 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved LUCENE-8696. - Resolution: Fixed Fix Version/s: 7.7.2 master (9.0

[jira] [Commented] (CONNECTORS-1587) Unable to Crawl Documents Meta data

2019-02-26 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778989#comment-16778989 ] Karl Wright commented on CONNECTORS-1587: - It is simple; the crawler is requesting more

[jira] [Resolved] (CONNECTORS-1587) Unable to Crawl Documents Meta data

2019-02-26 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1587. - Resolution: Invalid Not a ManifoldCF bug > Unable to Crawl Documents Meta d

[jira] [Commented] (CONNECTORS-1564) Support preemptive authentication to Solr connector

2019-02-26 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778159#comment-16778159 ] Karl Wright commented on CONNECTORS-1564: - [~erlendfg], if ModifiedHttpSolrClient overrides

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-26 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778005#comment-16778005 ] Karl Wright commented on LUCENE-8696: - I have confirmed that the above is indeed the issue. I did

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777688#comment-16777688 ] Karl Wright commented on LUCENE-8696: - Since we've eliminated the computation of the solid's example

[jira] [Comment Edited] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776899#comment-16776899 ] Karl Wright edited comment on LUCENE-8696 at 2/26/19 7:24 AM: -- Reviewing

Re: custom jcifs properties

2019-02-25 Thread Karl Wright
> > Add "-Djcifs.resolveOrder = LMHOSTS,DNS,WINS" to "start-options.env" file. > > If you have been convinced about this, I can create a PR. > > [1] https://www.jcifs.org/src/docs/resolver.html > [2] https://stackoverflow.com/a/18837754 > > Regards, > Cihad G

Re: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-25 Thread Karl Wright
Hi, any news here? Karl On Wed, Feb 20, 2019 at 1:35 PM Karl Wright wrote: > No, I stand corrected: the right class is in that jar: > > >>>>>> > C:\wip\mcf\trunk\dist\lib-proprietary>"c:\Program > Files\Java\jdk1.8.0_181\bin\jar" -tf jtds-1.2.4

[jira] [Commented] (SOLR-13270) SolrJ does not send "Expect: 100-continue" header

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777186#comment-16777186 ] Karl Wright commented on SOLR-13270: I just grepped for it and did not find it explicitly set: {code

[jira] [Commented] (SOLR-13270) SolrJ does not send "Expect: 100-continue" header

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777113#comment-16777113 ] Karl Wright commented on SOLR-13270: Hi [~erlendfg], can you identify where in the SolrJ code

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776975#comment-16776975 ] Karl Wright commented on LUCENE-8696: - [~jpountz], should be addressed now

[jira] [Comment Edited] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776899#comment-16776899 ] Karl Wright edited comment on LUCENE-8696 at 2/25/19 2:39 PM: -- Reviewing

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776899#comment-16776899 ] Karl Wright commented on LUCENE-8696: - Reviewing the solid, and what the edge points *should

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776881#comment-16776881 ] Karl Wright commented on LUCENE-8696: - Reviewing the solid edge point logic finds nothing wrong

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776841#comment-16776841 ] Karl Wright commented on LUCENE-8696: - I've verified that there are two solid edge points

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776821#comment-16776821 ] Karl Wright commented on LUCENE-8696: - Looking at the actual failure now. Basically, problem

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776637#comment-16776637 ] Karl Wright commented on LUCENE-8696: - I revised the simple test case to match the actual failure

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-25 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776617#comment-16776617 ] Karl Wright commented on LUCENE-8696: - [~ivera], I'm looking at your test case for reproducing

Re: custom jcifs properties

2019-02-24 Thread Karl Wright
These settings were provided by the developer of jcifs, Michael Allen. You have to really understand the protocol well before you should consider changing them in any way. Thanks, Karl On Sun, Feb 24, 2019 at 9:53 AM Cihad Guzel wrote: > Hi, > > SharedDriveConnector has some hardcoded

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-23 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776010#comment-16776010 ] Karl Wright commented on LUCENE-8696: - More debugging shows that the second circle plane is wildly

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-23 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776006#comment-16776006 ] Karl Wright commented on LUCENE-8696: - Added some simple diagnostics. The difference lies

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-23 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776002#comment-16776002 ] Karl Wright commented on LUCENE-8696: - Hmm, even when I use createSurfacePoint() with this point

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-23 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775839#comment-16775839 ] Karl Wright commented on LUCENE-8696: - Preliminary results indicate that the problem may be due

[jira] [Commented] (CONNECTORS-1587) Unable to Crawl Documents Meta data

2019-02-22 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775088#comment-16775088 ] Karl Wright commented on CONNECTORS-1587: - Can you amend your ticket to tell us what

Re: Sharepoint incremental crawl - last version

2019-02-22 Thread Karl Wright
Hi Gaurav, Yes, we can add fields to how the lastmodified column is computed, provided the information is available via web services. Please propose a patch. Thanks, Karl On Fri, Feb 22, 2019 at 7:16 AM Gaurav G wrote: > Hi All, > > We're facing a problem in getting updated sharepoint

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-22 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775050#comment-16775050 ] Karl Wright commented on LUCENE-8696: - The path in the test retraces its steps, but that should

Re: ManifoldCF Website Links

2019-02-22 Thread Karl Wright
elease/latest/en_US/performance-tuning.html > > > On Fri, Feb 22, 2019 at 11:34, Karl Wright wrote: > > > Hi Furkan, > > > > I am not sure why Google maintains these dead links but we simply cannot > > publish doc for every release going back t

Re: ManifoldCF Website Links

2019-02-22 Thread Karl Wright
Hi Furkan, I am not sure why Google maintains these dead links but we simply cannot publish doc for every release going back to 2012. Generally we cycle releases and include the last two for each major release. We include the 1.10 docs as well as the 2.12 and 2.11 docs right now. It is

[jira] [Commented] (CONNECTORS-1584) regex documentation

2019-02-21 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774309#comment-16774309 ] Karl Wright commented on CONNECTORS-1584: - Have you subscribed to the list? Instructions

Re: Active documents

2019-02-21 Thread Karl Wright
Please tell us how you have configured your job. Is it running in continuous mode? Because if so, that is exactly how it's supposed to look -- all documents remain active forever, until you stop the job. Karl On Thu, Feb 21, 2019 at 8:18 AM Remko Mantel wrote: > Good afternoon all, > > I

[jira] [Commented] (CONNECTORS-1584) regex documentation

2019-02-21 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774105#comment-16774105 ] Karl Wright commented on CONNECTORS-1584: - Actually, it *is* user@ but so many people get

Re:

2019-02-21 Thread Karl Wright
Yes, I would separate the work of transforming documents from the work of fetching them. Karl On Wed, Feb 20, 2019 at 9:46 PM Kayak28 wrote: > Hello, Mr. Karl Wright: > > Thank you for quick response. > As you mentioned, yes I am so writing my Repository Connector to access >

Re:

2019-02-20 Thread Karl Wright
Hi Kaya, You should be able to use the existing Solr connector to index documents into Solr. You will probably need to write a Repository connector to access the REST api you describe. If the kind of scraping you need to do can be covered by the html-extractor transformer in its current form,

Re: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Karl Wright
Karl Wright wrote: > OK, you are basically having trouble with the JDBC connector, not the > basic functioning of ManifoldCF. That was not clear. > > The JDBC driver class name for MSSQL has likely been updated and we'll > need to figure out what it got changed to. > > Karl

Re: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Karl Wright
start from Tomcat Service and I configured > /etc/systemd/system/tomcat.service : > > [Unit] > > Description=Apache Tomcat Web Application Container > > After=network.target > > > > [Service] > > Type=forking > > > > Environment=JAVA_HOME= /usr/

Re: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Karl Wright
propriate place. Karl On Wed, Feb 20, 2019 at 10:58 AM Karl Wright wrote: > The question is: how are you *starting* the processes? and what process > are you seeing the error from? You should *not* need to make any changes > to the configuration if you put the jar file in place befo

Re: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Karl Wright
arl but I didn’t download manually the .jar files. > > > > I compiled MCF 2.12 and I found the jar in the lib-proprietary folder. > > > > I added in properties.xml the : > > > > > > > > I tried to : > > initialize.sh the db but I have the same

Re: Threw exception: 'Driver class not found: net.sourceforge.jtds.jdbc.Driver'

2019-02-20 Thread Karl Wright
Hi Mario, You can't just plop down a jar in a directory and have this work, because ManifoldCF requires all JDBC drivers to be in the root classpath. They are therefore built into the classpath, which should happen if you use the startup scripts. Please review the "how-to-build-and-deploy"
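
As a quick diagnostic for the "Driver class not found" symptom discussed in this thread, the following minimal sketch checks whether a class name can be loaded from the classpath of the running JVM. It assumes you launch it with the same classpath that your ManifoldCF startup scripts build; the class name `DriverCheck` and the helper `isLoadable` are illustrative and not part of ManifoldCF. A positive result only shows the class is visible to this JVM's class loader; it does not prove the jar is placed where ManifoldCF itself expects it.

```java
// Illustrative helper (not part of ManifoldCF): reports whether a class
// such as a JDBC driver is visible on the current JVM's classpath.
class DriverCheck {

    /** Returns true if the named class can be loaded and initialized. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but missing its own dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the driver class named in the thread subject.
        String driver = args.length > 0 ? args[0] : "net.sourceforge.jtds.jdbc.Driver";
        System.out.println(driver + (isLoadable(driver) ? " is" : " is NOT")
            + " loadable from this classpath");
    }
}
```

Running it once with and once without the driver jar on `-cp` is a simple way to confirm that the classpath, rather than the connection configuration, is the source of the error.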

[jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes

2019-02-20 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772995#comment-16772995 ] Karl Wright commented on CONNECTORS-1563: - [~Subasini], we are trying to debug your setup

[jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes

2019-02-20 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772912#comment-16772912 ] Karl Wright commented on CONNECTORS-1563: - [~Subasini], the "error" is becau

[jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes

2019-02-19 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772729#comment-16772729 ] Karl Wright commented on CONNECTORS-1563: - In general in cases like this I recommend

[jira] [Commented] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-19 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771682#comment-16771682 ] Karl Wright commented on LUCENE-8696: - [~ivera], would you be willing to construct a simple test

[jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes

2019-02-18 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771663#comment-16771663 ] Karl Wright commented on CONNECTORS-1563: - Hi Subasini, Are you now Tika-extracting

[jira] [Resolved] (CONNECTORS-1584) regex documentation

2019-02-18 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1584. - Resolution: Not A Problem > regex documentat

Re: Error integrity constraint violation

2019-02-18 Thread Karl Wright
Hi Kaya, Database constraint violations, as you know, occur because you're trying to put more than one identical value into a table column that does not allow duplicates. For the table in question, if you have the same class name for two different connectors, this would be what you'd expect.

[jira] [Commented] (CONNECTORS-1584) regex documentation

2019-02-18 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771462#comment-16771462 ] Karl Wright commented on CONNECTORS-1584: - The mailing list is us...@manifoldcf.apache.org

[jira] [Assigned] (LUCENE-8696) TestGeo3DPoint.testGeo3DRelations failure

2019-02-18 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned LUCENE-8696: --- Assignee: Karl Wright > TestGeo3DPoint.testGeo3DRelations fail

[jira] [Resolved] (CONNECTORS-1585) MCF Admin page shows 404 error frequently

2019-02-18 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1585. - Resolution: Cannot Reproduce > MCF Admin page shows 404 error frequen

[jira] [Commented] (CONNECTORS-1585) MCF Admin page shows 404 error frequently

2019-02-18 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16771461#comment-16771461 ] Karl Wright commented on CONNECTORS-1585: - 404 errors have nothing to do with ManifoldCF

[jira] [Resolved] (CONNECTORS-1580) Issues in documentum connector

2019-02-12 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1580. - Resolution: Won't Fix > Issues in documentum connec

[jira] [Commented] (CONNECTORS-1580) Issues in documentum connector

2019-02-12 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766799#comment-16766799 ] Karl Wright commented on CONNECTORS-1580: - You are on your own here. You are trying to use

Re: SmbAuthException thrown: Logon failure: unknown user name or bad password and SmbException tossed: NT_STATUS_NOT_FOUND

2019-02-12 Thread Karl Wright
Hi Damien, The JCIFS connector is in fact retrying most of these errors. Only after it fails repeatedly does it give up and skip the document. That's what's happening for the access exception case. For the NT_STATUS_NOT_FOUND case, it looks like we don't know what to make of that error and if
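
The retry-then-skip behaviour described in this message can be pictured with a generic sketch like the one below. This is not the JCIFS connector's actual code; the class `RetryThenSkip`, the method `fetchWithRetries`, and the fixed attempt limit are assumptions made purely for illustration (the real connector decides per error type whether to retry).

```java
import java.util.Optional;
import java.util.concurrent.Callable;

// Illustrative only: bounded retries, then give up and skip the document.
class RetryThenSkip {

    /**
     * Calls fetch up to maxAttempts times. Returns the fetched value on the
     * first success, or Optional.empty() if every attempt threw, which the
     * caller treats as "skip this document". A null result also maps to empty.
     */
    static <T> Optional<T> fetchWithRetries(Callable<T> fetch, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return Optional.ofNullable(fetch.call());
            } catch (Exception e) {
                // Treated as transient here; loop around and try again.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> doc = fetchWithRetries(() -> {
            throw new java.io.IOException("simulated access failure");
        }, 3);
        System.out.println(doc.isPresent() ? "fetched" : "skipped after repeated failures");
    }
}
```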

Re: Apache ManifoldCF: Get a history report for a repository connection over REST API

2019-02-12 Thread Karl Wright
Yes, query parameters in any URL go after the fixed "path" part of the URL, and are of the form ?parameter=value&parameter2=value2... just like any other URL. My suspicion is that you aren't supplying the activity(s) that you want to match. The best way to figure out what activities make sense for the
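
To make the URL shape concrete, here is a small sketch that appends URL-encoded query parameters to a fixed path. The path and the parameter names used in `main` are placeholders, not the actual ManifoldCF REST API resource or argument names; consult the API documentation for those. The class `QueryUrl` and its methods are likewise illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper: builds path?name=value&name2=value2 with encoding.
class QueryUrl {

    /** Appends the parameters to the path, in map iteration order. */
    static String withQuery(String path, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(path);
        char separator = '?';              // first parameter follows '?'
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';               // later parameters follow '&'
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("parameter", "value");      // placeholder names
        params.put("parameter2", "value 2");
        System.out.println(withQuery("/example/path", params));
    }
}
```

Note that `URLEncoder` encodes a space as `+`, which is the form-encoding convention and is acceptable in the query part of a URL.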

Re: Error: Unexpected jobqueue status - record id X, expecting active status, saw 4 (MySQL compatible Database)

2019-02-12 Thread Karl Wright
Hi Marcus, There's a properties.xml debugging logger you can enable that will keep track of what's happening with transactions, so that when an error of this kind is reported, information about why the situation is unexpected is dumped to the log. The logger is called "diagnostics" e.g.

[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup

2019-02-12 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766037#comment-16766037 ] Karl Wright commented on CONNECTORS-1581: - It's possible that the problem is due

[jira] [Commented] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data

2019-02-12 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765959#comment-16765959 ] Karl Wright commented on CONNECTORS-1582: - The purpose is to decide whether the document

[jira] [Commented] (CONNECTORS-1583) ManifoldCF getting hung frequently

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765760#comment-16765760 ] Karl Wright commented on CONNECTORS-1583: - How have you deployed ManifoldCF? What app

[jira] [Resolved] (CONNECTORS-1583) ManifoldCF getting hung frequently

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1583. - Resolution: Incomplete > ManifoldCF getting hung frequen

[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765178#comment-16765178 ] Karl Wright commented on CONNECTORS-1581: - Yes if the job ID doesn't show up anywhere it's

[jira] [Commented] (CONNECTORS-1580) Issues in documentum connector

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765083#comment-16765083 ] Karl Wright commented on CONNECTORS-1580: - So you modified the Documentum Connector

[jira] [Assigned] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1582: --- Assignee: Karl Wright > Unable to Crawl the Site Contents and Meta-D

[jira] [Resolved] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1582. - Resolution: Not A Problem > Unable to Crawl the Site Contents and Meta-D

[jira] [Commented] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data

2019-02-11 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765019#comment-16765019 ] Karl Wright commented on CONNECTORS-1582: - Hi [~Pavithrad], the problem is that you

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-11 Thread Karl Wright
but is there a value that disables the reindex and the analyze, > for example "-1" or "0"? I didn't find it in the documentation. > > Thank you > > > On 11/02/2019 at 12:26, > Karl Wright (via Internet, deposited by > user-return-5690-daniel.lirot=developpe

Re: Job hang in aborting state for along time

2019-02-11 Thread Karl Wright
.4,10+ or 11+? > > Thanks, > Cihad Guzel > > > On Mon, Feb 11, 2019 at 04:01, Karl Wright wrote: > >> No, it is not normal. I expect that the MySQL transaction issues are >> causing lots of problems. >> >> Karl >> >> >> On Sun, F

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-11 Thread Karl Wright
.xml) with the line : > value="500" /> > > Is there an instruction that disables the reindex requested by > manifoldcf? > > Thanks > > Daniel > > > On 08/02/2019 at 16:00, > Karl Wright (via Internet, deposited by > user-return-5674-daniel.li

Re: Job hang in aborting state for along time

2019-02-10 Thread Karl Wright
What database is this? Basically, the "unexpected job status" means that the framework found something that should not have been possible, if the database had been properly enforcing ACID transactional constraints. Is this MySQL? Because if so it's known to have this problem. It also looks like

Re: Sharepoint Job - Incremental Crawling

2019-02-09 Thread Karl Wright
ter as the sharepoint > servers. Currently they are in different DCs with dedicated MPLS > connectivity. > > Thanks, > Gaurav > > On Sat, Feb 9, 2019 at 3:03 AM Karl Wright wrote: > >> The problem is not the speed of Manifold, but rather the work it has to >> do an

Re: Sharepoint Job - Incremental Crawling

2019-02-08 Thread Karl Wright
vacuum once daily. > > Would switching to a multi process configuration with manifoldcf running > on two servers give a boost. > > Thanks, > Gaurav > > On Saturday, February 9, 2019, Karl Wright wrote: > >> It does the minimum necessary. That means it can't do it in le

Re: Sharepoint Job - Incremental Crawling

2019-02-08 Thread Karl Wright
er of docs that actually change in a 30 min period won't be more than > 200. > > Being able to capture adds and updates in 30 minutes is a key business > requirement. > > Thanks, > Gaurav > > On Friday, February 8, 2019, Karl Wright wrote: > >> Hi Gaurav, >>

[jira] [Commented] (CONNECTORS-1580) Issues in documentum connector

2019-02-08 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763725#comment-16763725 ] Karl Wright commented on CONNECTORS-1580: - {quote} The documents which have already got

Re: Sharepoint Job - Incremental Crawling

2019-02-08 Thread Karl Wright
Hi Gaurav, The right way to do this is to schedule "minimal" crawls every 15 minutes (which will process only the minimum needed to deal with adds and updates), and periodically perform "full" crawls (which will also include deletions). Thanks, Karl On Fri, Feb 8, 2019 at 10:11 AM Gaurav G

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-08 Thread Karl Wright
Hello, (1) What database are you using for this? Some databases require maintenance periodically or have other heavy usage constraints. (2) Every time a query takes more than a minute to execute, it is logged, along with the query plan. You need to look at the manifoldcf log to see which

[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup

2019-02-08 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763667#comment-16763667 ] Karl Wright commented on CONNECTORS-1581: - I am pretty concerned that the database layer

[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup

2019-02-08 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763530#comment-16763530 ] Karl Wright commented on CONNECTORS-1581: - Here's the code that's throwing an NPE: {code

Re: Postgres db maintenance

2019-02-08 Thread Karl Wright
The only "old data" kept by MCF is the history information. By default it's expunged after 30 days. You can shorten the amount of time it's kept around though by setting a properties.xml parameter (need to refer to the "how-to-build-and-deploy" page for details). Karl On Fri, Feb 8, 2019 at

[jira] [Commented] (CONNECTORS-1580) Issues in documentum connector

2019-02-08 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763404#comment-16763404 ] Karl Wright commented on CONNECTORS-1580: - Hi, I can make almost no sense of this ticket

[jira] [Commented] (CONNECTORS-1579) Error when crawling a MSSQL table

2019-02-08 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763395#comment-16763395 ] Karl Wright commented on CONNECTORS-1579: - You can either check out the entire current trunk

[jira] [Resolved] (CONNECTORS-1579) Error when crawling a MSSQL table

2019-02-05 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1579. - Resolution: Fixed Fix Version/s: ManifoldCF 2.13 r1853008 > Error w

[jira] [Updated] (CONNECTORS-1579) Error when crawling a MSSQL table

2019-02-05 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-1579: Attachment: CONNECTORS-1579.patch > Error when crawling a MSSQL ta

[jira] [Commented] (CONNECTORS-1579) Error when crawling a MSSQL table

2019-02-05 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760848#comment-16760848 ] Karl Wright commented on CONNECTORS-1579: - It's a bug in the code. Whenever the JDBC

[jira] [Commented] (CONNECTORS-1579) Error when crawling a MSSQL table

2019-02-05 Thread Karl Wright (JIRA)
[ https://issues.apache.org/jira/browse/CONNECTORS-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760818#comment-16760818 ] Karl Wright commented on CONNECTORS-1579: - Hi, The proximate cause of the problem
