[jira] [Created] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-04 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-638:
--

 Summary: Possible leakage of database connection handles
 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0
Reporter: Karl Wright
Assignee: Karl Wright


Reports of a leak of connection handles, shown because ManifoldCF winds up 
waiting to get a connection, and failing.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-04 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570152#comment-13570152
 ] 

Karl Wright commented on CONNECTORS-638:


r1442101 commits code that tracks allocated database handles for the purpose of 
diagnostics.
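
As an illustration only (this is not the code committed in r1442101), handle tracking for diagnostics could be sketched as follows. Each handle is recorded, together with the stack trace of its allocation site, when it is handed out, so that handles still outstanding can be listed later. The class name HandleTracker and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper; not the actual ManifoldCF code.
class HandleTracker
{
  // Maps each outstanding handle to an exception that captures where it was allocated.
  private final Map<Object,Exception> outstanding = new IdentityHashMap<Object,Exception>();

  // Record a handle at the moment it is handed out.
  public synchronized void allocated(Object handle)
  {
    outstanding.put(handle, new Exception("Handle allocated here"));
  }

  // Forget a handle when it is returned to the pool.
  public synchronized void released(Object handle)
  {
    outstanding.remove(handle);
  }

  // Number of handles currently checked out.
  public synchronized int outstandingCount()
  {
    return outstanding.size();
  }

  // Snapshot of the allocation sites, suitable for writing to a log.
  public synchronized List<Exception> allocationSites()
  {
    return new ArrayList<Exception>(outstanding.values());
  }
}
```

Logging the stack trace of each entry returned by allocationSites() would show which code paths are holding handles at the moment the pool runs dry.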


 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0
Reporter: Karl Wright
Assignee: Karl Wright

 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.



[jira] [Updated] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-05 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-638:
---

Fix Version/s: ManifoldCF 1.2
 Priority: Critical  (was: Major)
  Description: 
Reports of a leak of connection handles, shown because ManifoldCF winds up 
waiting to get a connection, and failing.

The database in question so far always seems to be PostgreSQL, FWIW, and it has 
been proven that there is a connection handle leak, so that all the crawler 
threads are eventually waiting for non-existent connection handles.



  was:
Reports of a leak of connection handles, shown because ManifoldCF winds up 
waiting to get a connection, and failing.


Affects Version/s: ManifoldCF 1.0.1

 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Commented] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-05 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571297#comment-13571297
 ] 

Karl Wright commented on CONNECTORS-638:


It's worth noting that I think it is impossible for the bug here to be in 
either PostgreSQL or the PostgreSQL JDBC driver.  The only way it can occur is 
by miscounting handles that we have successfully allocated already.  So it is 
either in the pool code (which is suspicious because it was new in ManifoldCF 
1.0), or in one or two particular kinds of SQL transactions.

A careful audit of the connection pool code should thus be done.  Since I've 
already done this more than once, another pair of eyes might be useful...


 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Commented] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-07 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573556#comment-13573556
 ] 

Karl Wright commented on CONNECTORS-638:


I found one possibility in the connection expiration code.  If the database 
handle close operation fails during connection expiration, and it fails by 
throwing anything other than a SQLException, we could leak handles.

r1443521 fixes this potential issue.
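
The failure mode described above can be sketched as follows. This is a simplified illustration, not the change made in r1443521; ExpiringPool, Handle, activeCount, and expire are hypothetical names. The point is that the pool's bookkeeping is updated in a finally block, so a failure from close() that is not a SQLException cannot skip it.

```java
import java.sql.SQLException;

// Hypothetical, simplified pool; not the actual ManifoldCF code.
class ExpiringPool
{
  // Stand-in for a database handle whose close() may fail.
  interface Handle
  {
    void close() throws SQLException;
  }

  // Number of handles the pool believes exist.
  private int activeCount;

  public ExpiringPool(int initialCount)
  {
    activeCount = initialCount;
  }

  public synchronized int getActiveCount()
  {
    return activeCount;
  }

  // Expire one handle. The count is adjusted in finally, so it stays correct
  // whether close() succeeds, throws SQLException, or throws something else.
  public synchronized void expire(Handle handle)
  {
    try
    {
      handle.close();
    }
    catch (SQLException e)
    {
      // The handle is discarded either way; nothing more to do here.
    }
    finally
    {
      activeCount--;
    }
  }
}
```

With the decrement placed after the try block instead of in finally, a RuntimeException from close() would leave activeCount one too high, and the pool would later wait for a handle that no longer exists.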


 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Commented] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-07 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573574#comment-13573574
 ] 

Karl Wright commented on CONNECTORS-638:


The log is very puzzling.

Only seven active + leaked connections are listed.  This is out of a connection 
pool that had 100 in it at the start.  And yet we seem to be out of connections!

The other interesting point is that all seven of the handles listed are at 
exactly the same place in the code - which is doing something pretty basic: a 
single query, not in a transaction.

The code in question that manages connections through this pathway is pretty 
straightforward:

{code}
  // Grab a connection
  WrappedConnection tempConnection = ConnectionFactory.getConnection(
    jdbcUrl,jdbcDriverClass,databaseName,userName,password);
  try
  {
    // Initialize the connection (for HSQLDB)
    initializeConnection(tempConnection.getConnection());
    return executeViaThread(tempConnection.getConnection(),
      query,params,bResults,maxResults,spec,returnLimit);
  }
  catch (ManifoldCFException e)
  {
    if (e.getErrorCode() == ManifoldCFException.INTERRUPTED)
      // drop the connection object on the floor, so it cannot possibly be reused
      tempConnection = null;
    throw e;
  }
  finally
  {
    if (tempConnection != null)
      ConnectionFactory.releaseConnection(tempConnection);
  }
{code}

The only huge warning sign here is the special treatment of 
ManifoldCFException.INTERRUPTED exceptions.  These are usually thrown only when 
we're trying to shut down ManifoldCF.  I wonder if the PostgreSQL JDBC driver 
is throwing them somehow in other situations - maybe when a query takes too 
long and PostgreSQL aborts it?

If that exception IS getting thrown, it would also cause the worker thread in 
which it happened to shut itself down.  A thread dump of the agents process 
should show definitively if we've lost worker threads due to this reason.  
Erlend, can you get that, and attach it also?
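
To make the concern concrete, here is a toy model of the pathway in the snippet above. It is a sketch under stated assumptions, not ManifoldCF code: ToyPool and runQuery are hypothetical, and a plain InterruptedException stands in for a ManifoldCFException carrying the INTERRUPTED error code. It shows that every interrupted call leaves one handle permanently unreturned.

```java
// Toy model of the acquire/try/finally pattern; all names are hypothetical.
class ToyPool
{
  private int available;

  public ToyPool(int size)
  {
    available = size;
  }

  public synchronized int getAvailable()
  {
    return available;
  }

  public synchronized Object acquire()
  {
    if (available == 0)
      throw new IllegalStateException("Pool exhausted");
    available--;
    return new Object();
  }

  public synchronized void release(Object handle)
  {
    available++;
  }

  // Mirrors the structure of the snippet: on interruption the handle
  // reference is set to null, so the finally block does not release it.
  public void runQuery(boolean interrupt) throws InterruptedException
  {
    Object handle = acquire();
    try
    {
      if (interrupt)
        throw new InterruptedException("simulated interruption");
    }
    catch (InterruptedException e)
    {
      handle = null;
      throw e;
    }
    finally
    {
      if (handle != null)
        release(handle);
    }
  }
}
```

If such exceptions were thrown during normal crawling rather than only at shutdown, the pool would shrink by one handle each time, which is why counting the surviving worker threads is informative.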



 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2

 Attachments: manifoldcf.log


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Commented] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-07 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573688#comment-13573688
 ] 

Karl Wright commented on CONNECTORS-638:


I've reviewed the code carefully.  I can find no place where a 
ManifoldCFException.INTERRUPTED could be thrown as a result of something the 
JDBC driver does.

I think it's critical now to see the thread dump of the agents process.  I need 
to count the remaining worker threads.  It doesn't matter when you do it, 
except that it has to be after these log messages start to appear, and before 
the agents process gets restarted.


 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2

 Attachments: manifoldcf.log


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Updated] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-638:
---

Fix Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1.1

 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.1.1

 Attachments: manifoldcf.log


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Updated] (CONNECTORS-632) mvn-bootstrap shows errors, maven build fails

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-632:
---

Fix Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1.1

 mvn-bootstrap shows errors, maven build fails
 -

 Key: CONNECTORS-632
 URL: https://issues.apache.org/jira/browse/CONNECTORS-632
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1

 Attachments: maven-build-combined-service.patch


 What I did:
 1. checked out current trunk to new directory (fresh copy)
 2. run mvn-bootstrap
- there were errors:
   - timeout when downloading fonts (network is ok, maybe target server 
 problem)
   - problems with installing libraries from [checkout root]/lib directory 
 (there is no such directory), below is example:
 [ERROR] BUILD ERROR
 [INFO] 
 
 [INFO] Error installing artifact 'org.hsqldb:hsqldb:jar': Error installing artifact: File C:\manifoldcf3\lib\hsqldb.jar does not exist
 3. run: mvn install
- build fails test APISanityDerbyIT:
 17818 [main] INFO org.eclipse.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:9090 STARTING
 OpenCMIS InMemory server is started listening on port 9090
 java.lang.Exception: Job reports error.
 at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.waitJobInactive(APISanityDerbyIT.java:558)
 at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.sanityCheck(APISanityDerbyIT.java:398)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
 at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
 at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
 at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
 at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
 at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
 at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)



[jira] [Updated] (CONNECTORS-629) ElasticSearch connector really needs better error handling

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-629:
---

Fix Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1.1

 ElasticSearch connector really needs better error handling
 --

 Key: CONNECTORS-629
 URL: https://issues.apache.org/jira/browse/CONNECTORS-629
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 When there is a problem connecting, you get really useless exceptions like 
 this in the log:
 {code}
 ERROR 2013-01-30 13:44:15,356 (Worker thread '45') - Exception tossed:
 org.apache.manifoldcf.core.interfaces.ManifoldCFException:
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:97)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchIndex.init(ElasticSearchIndex.java:138)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.addOrReplaceDocument(ElasticSearchConnector.java:322)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
 at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
 at org.apache.manifoldcf.crawler.connectors.DCTM.DCTM.processDocuments(DCTM.java:1820)
 at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
 at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
 {code}



[jira] [Updated] (CONNECTORS-606) ElasticSearch connector does not use background thread for http communication, and should

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-606:
---

Fix Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1.1

 ElasticSearch connector does not use background thread for http 
 communication, and should
 -

 Key: CONNECTORS-606
 URL: https://issues.apache.org/jira/browse/CONNECTORS-606
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 The Elastic Search connector communicates using httpcomponent without going 
 via a background thread.  This is a problem because any socket waits on an 
 elastic search server will block ManifoldCF agents shutdown.
 Please see the RSS connector, web connector, livelink connector, or 
 SharePoint connector for an example of proper use of background threading for 
 http communication.  The Livelink connector is the simplest.
 It is also important to never load entire documents into memory, but stream 
 them instead.
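
The general idea can be sketched as below. This is not taken from any of the connectors named above; BackgroundCall is a hypothetical helper. The blocking work runs in a daemon thread, and the calling thread waits with join(), which responds to interruption even while the background thread is stuck in a socket wait.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; a sketch of the pattern, not ManifoldCF code.
class BackgroundCall<T> extends Thread
{
  private final Callable<T> work;
  private T result;
  private Throwable failure;

  public BackgroundCall(Callable<T> work)
  {
    this.work = work;
    // A daemon thread cannot keep the JVM alive during shutdown.
    setDaemon(true);
  }

  @Override
  public void run()
  {
    try
    {
      result = work.call();
    }
    catch (Throwable t)
    {
      failure = t;
    }
  }

  // Start the work and wait for it. If the caller is interrupted while
  // waiting, join() throws InterruptedException right away and the
  // background thread is simply abandoned.
  public T execute() throws Exception
  {
    start();
    join();
    if (failure instanceof Exception)
      throw (Exception)failure;
    if (failure instanceof Error)
      throw (Error)failure;
    if (failure != null)
      throw new RuntimeException(failure);
    return result;
  }
}
```

A caller would wrap the http request in a Callable and invoke execute(); the worker thread then remains interruptible regardless of what the socket is doing.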



[jira] [Reopened] (CONNECTORS-606) ElasticSearch connector does not use background thread for http communication, and should

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reopened CONNECTORS-606:



Will close when code ported to release branch

 ElasticSearch connector does not use background thread for http 
 communication, and should
 -

 Key: CONNECTORS-606
 URL: https://issues.apache.org/jira/browse/CONNECTORS-606
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 The Elastic Search connector communicates using httpcomponent without going 
 via a background thread.  This is a problem because any socket waits on an 
 elastic search server will block ManifoldCF agents shutdown.
 Please see the RSS connector, web connector, livelink connector, or 
 SharePoint connector for an example of proper use of background threading for 
 http communication.  The Livelink connector is the simplest.
 It is also important to never load entire documents into memory, but stream 
 them instead.



[jira] [Reopened] (CONNECTORS-634) ElasticSearch now returns 201 http codes on success, not just 200 codes

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reopened CONNECTORS-634:



will close when code ported to release branch

 ElasticSearch now returns 201 http codes on success, not just 200 codes
 ---

 Key: CONNECTORS-634
 URL: https://issues.apache.org/jira/browse/CONNECTORS-634
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 Elastic Search now returns httpcode 201 for successful indexing events.  The 
 connector needs to accept these.
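
A minimal sketch of such a check, assuming the connector has the numeric HTTP status code in hand. IndexingStatus and isIndexingSuccess are hypothetical names, not the connector's actual code.

```java
import java.net.HttpURLConnection;

// Hypothetical helper; not the actual connector code.
class IndexingStatus
{
  // Treat both 200 (OK) and 201 (Created) as successful indexing responses.
  public static boolean isIndexingSuccess(int httpCode)
  {
    return httpCode == HttpURLConnection.HTTP_OK
      || httpCode == HttpURLConnection.HTTP_CREATED;
  }
}
```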



[jira] [Updated] (CONNECTORS-637) NPE from Elastic Search connector

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-637:
---

Fix Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1.1

 NPE from Elastic Search connector
 -

 Key: CONNECTORS-637
 URL: https://issues.apache.org/jira/browse/CONNECTORS-637
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 NPE as follows:
 {code}
 FATAL 2013-02-01 14:32:38,255 (Worker thread '5') - Error tossed: null
 java.lang.NullPointerException
 at java.util.TreeMap.getEntry(TreeMap.java:324)
 at java.util.TreeMap.containsKey(TreeMap.java:209)
 at java.util.TreeSet.contains(TreeSet.java:217)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchSpecs.checkMimeType(ElasticSearchSpecs.java:164)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.checkMimeTypeIndexable(ElasticSearchConnector.java:333)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.checkMimeTypeIndexable(IncrementalIngester.java:212)
 at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.checkMimeTypeIndexable(WorkerThread.java:2091)
 at org.apache.manifoldcf.crawler.connectors.DCTM.DCTM.processDocuments(DCTM.java:1811)
 at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
 at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:556)
 {code}
 Apparently the DocumentumConnector is passing in null mime types, and the 
 elastic search connector doesn't like it.
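
The stack trace is consistent with how TreeSet behaves: with natural ordering, contains(null) throws NullPointerException. A guarded lookup might look like the following sketch. MimeTypeFilter is a hypothetical stand-in for the logic in ElasticSearchSpecs.checkMimeType, and treating a null mime type as not indexable is an assumption; the actual fix may have chosen differently.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in; not the actual ElasticSearchSpecs code.
class MimeTypeFilter
{
  private final Set<String> allowed = new TreeSet<String>();

  public MimeTypeFilter(String... mimeTypes)
  {
    allowed.addAll(Arrays.asList(mimeTypes));
  }

  // Null-safe membership test. Without the null check, TreeSet.contains(null)
  // would throw NullPointerException under natural ordering.
  public boolean checkMimeType(String mimeType)
  {
    if (mimeType == null)
      return false; // assumption: an unknown mime type is treated as not indexable
    return allowed.contains(mimeType);
  }
}
```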



[jira] [Reopened] (CONNECTORS-629) ElasticSearch connector really needs better error handling

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reopened CONNECTORS-629:



will close when code ported to release branch

 ElasticSearch connector really needs better error handling
 --

 Key: CONNECTORS-629
 URL: https://issues.apache.org/jira/browse/CONNECTORS-629
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 When there is a problem connecting, you get really useless exceptions like 
 this in the log:
 {code}
 ERROR 2013-01-30 13:44:15,356 (Worker thread '45') - Exception tossed:
 org.apache.manifoldcf.core.interfaces.ManifoldCFException:
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:97)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchIndex.init(ElasticSearchIndex.java:138)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.addOrReplaceDocument(ElasticSearchConnector.java:322)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
 at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
 at org.apache.manifoldcf.crawler.connectors.DCTM.DCTM.processDocuments(DCTM.java:1820)
 at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
 at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
 {code}



[jira] [Resolved] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-638.


Resolution: Fixed

 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.1.1

 Attachments: manifoldcf.log


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Commented] (CONNECTORS-638) Possible leakage of database connection handles

2013-02-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574497#comment-13574497
 ] 

Karl Wright commented on CONNECTORS-638:


r1444015 (release branch)


 Possible leakage of database connection handles
 ---

 Key: CONNECTORS-638
 URL: https://issues.apache.org/jira/browse/CONNECTORS-638
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.1.1

 Attachments: manifoldcf.log


 Reports of a leak of connection handles, shown because ManifoldCF winds up 
 waiting to get a connection, and failing.
 The database in question so far always seems to be PostgreSQL, FWIW, and it 
 has been proven that there is a connection handle leak, so that all the 
 crawler threads are eventually waiting for non-existent connection handles.



[jira] [Reopened] (CONNECTORS-632) mvn-bootstrap shows errors, maven build fails

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reopened CONNECTORS-632:



will close when code is pulled up to branch

 mvn-bootstrap shows errors, maven build fails
 -

 Key: CONNECTORS-632
 URL: https://issues.apache.org/jira/browse/CONNECTORS-632
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1

 Attachments: maven-build-combined-service.patch


 What I did:
 1. checked out current trunk to new directory (fresh copy)
 2. run mvn-bootstrap
- there were errors:
   - timeout when downloading fonts (network is ok, maybe target server 
 problem)
   - problems with installing libraries from [checkout root]/lib directory 
 (there is no such directory), below is example:
 [ERROR] BUILD ERROR
 [INFO] 
 
 [INFO] Error installing artifact 'org.hsqldb:hsqldb:jar': Error installing artifact: File C:\manifoldcf3\lib\hsqldb.jar does not exist
 3. run: mvn install
- build fails test APISanityDerbyIT:
 17818 [main] INFO org.eclipse.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:9090 STARTING
 OpenCMIS InMemory server is started listening on port 9090
 java.lang.Exception: Job reports error.
 at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.waitJobInactive(APISanityDerbyIT.java:558)
 at org.apache.manifoldcf.elasticsearch_tests.APISanityDerbyIT.sanityCheck(APISanityDerbyIT.java:398)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
 at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
 at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
 at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
 sorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(
 ReflectionUtils.java:189)
 at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke
 (ProviderFactory.java:165)
 at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(Provi
 derFactory.java:85)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(Fork
 edBooter.java:115)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:
 75)



[jira] [Updated] (CONNECTORS-636) Maven build of combined war does not build a fully functioning war

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-636:
---

Fix Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1.1
Affects Version/s: (was: ManifoldCF 1.2)
   ManifoldCF 1.1

 Maven build of combined war does not build a fully functioning war
 --

 Key: CONNECTORS-636
 URL: https://issues.apache.org/jira/browse/CONNECTORS-636
 Project: ManifoldCF
  Issue Type: Bug
  Components: Build
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 The maven build of the mcf-combined-service war does not actually produce a 
 fully-functioning war.  If you note the exceptions thrown when the tests run, 
 the TLDs for the crawler-ui part do not have a corresponding standard tag 
 library in the war's jar list.  There may be other problems too; someone 
 should look at this in depth.
 FWIW, the combined war in the ant build brings together webapp components 
 from the crawler-ui via a file copy into a build area.  This was done so that 
 there would be only one set of JSPs, but unfortunately this may be a bit of a 
 challenge to do with Maven.



[jira] [Assigned] (CONNECTORS-636) Maven build of combined war does not build a fully functioning war

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-636:
--

Assignee: Karl Wright  (was: Maciej Lizewski)

 Maven build of combined war does not build a fully functioning war
 --

 Key: CONNECTORS-636
 URL: https://issues.apache.org/jira/browse/CONNECTORS-636
 Project: ManifoldCF
  Issue Type: Bug
  Components: Build
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2





[jira] [Resolved] (CONNECTORS-636) Maven build of combined war does not build a fully functioning war

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-636.


Resolution: Fixed

 Maven build of combined war does not build a fully functioning war
 --

 Key: CONNECTORS-636
 URL: https://issues.apache.org/jira/browse/CONNECTORS-636
 Project: ManifoldCF
  Issue Type: Bug
  Components: Build
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1





[jira] [Commented] (CONNECTORS-636) Maven build of combined war does not build a fully functioning war

2013-02-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574517#comment-13574517
 ] 

Karl Wright commented on CONNECTORS-636:


r1444028 (release branch)


 Maven build of combined war does not build a fully functioning war
 --

 Key: CONNECTORS-636
 URL: https://issues.apache.org/jira/browse/CONNECTORS-636
 Project: ManifoldCF
  Issue Type: Bug
  Components: Build
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1





[jira] [Commented] (CONNECTORS-632) mvn-bootstrap shows errors, maven build fails

2013-02-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574518#comment-13574518
 ] 

Karl Wright commented on CONNECTORS-632:


r1444028 (release branch)


 mvn-bootstrap shows errors, maven build fails
 -

 Key: CONNECTORS-632
 URL: https://issues.apache.org/jira/browse/CONNECTORS-632
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1

 Attachments: maven-build-combined-service.patch





[jira] [Resolved] (CONNECTORS-632) mvn-bootstrap shows errors, maven build fails

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-632.


Resolution: Fixed

 mvn-bootstrap shows errors, maven build fails
 -

 Key: CONNECTORS-632
 URL: https://issues.apache.org/jira/browse/CONNECTORS-632
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1

 Attachments: maven-build-combined-service.patch





[jira] [Resolved] (CONNECTORS-606) ElasticSearch connector does not use background thread for http communication, and should

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-606.


Resolution: Fixed

r1444089 (release branch)


 ElasticSearch connector does not use background thread for http 
 communication, and should
 -

 Key: CONNECTORS-606
 URL: https://issues.apache.org/jira/browse/CONNECTORS-606
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 The Elastic Search connector communicates using httpcomponent without going 
 via a background thread.  This is a problem because any socket waits on an 
 elastic search server will block ManifoldCF agents shutdown.
 Please see the RSS connector, web connector, livelink connector, or 
 SharePoint connector for an example of proper use of background threading for 
 http communication.  The Livelink connector is the simplest.
 It is also important to never load entire documents into memory, but stream 
 them instead.
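
The background-thread approach described above can be sketched roughly as follows. This is a simplified illustration, not the code used by the Livelink or RSS connectors: `BackgroundCall` and `execute` are hypothetical names, and the `Callable` stands in for whatever blocking HTTP request the connector makes. The point is that the caller blocks in `join()`, which responds to interruption, rather than in a socket read, which does not.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; illustrates the pattern only.
final class BackgroundCall<T> extends Thread {
  private final Callable<T> task;
  private T result;
  private Throwable failure;

  BackgroundCall(Callable<T> task) {
    this.task = task;
    // A daemon thread does not keep the JVM alive if it is stuck in a socket wait.
    setDaemon(true);
  }

  @Override
  public void run() {
    try {
      result = task.call();
    } catch (Throwable t) {
      failure = t;
    }
  }

  // Waits for the background work and rethrows whatever it threw.
  T finishUp() throws Exception {
    join(); // interruptible, unlike a blocking socket read
    if (failure instanceof Exception) throw (Exception) failure;
    if (failure instanceof Error) throw (Error) failure;
    if (failure != null) throw new RuntimeException(failure);
    return result;
  }

  // Runs the task on a background thread while the caller stays interruptible.
  static <T> T execute(Callable<T> task) throws Exception {
    BackgroundCall<T> worker = new BackgroundCall<T>(task);
    worker.start();
    try {
      return worker.finishUp();
    } catch (InterruptedException e) {
      // The caller is shutting down: signal the worker and stop waiting for it.
      worker.interrupt();
      throw e;
    }
  }
}
```

With this shape, a shutdown request interrupts the waiting caller immediately, and the abandoned daemon thread ends on its own when the socket eventually times out.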



[jira] [Resolved] (CONNECTORS-629) ElasticSearch connector really needs better error handling

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-629.


Resolution: Fixed

r1444089 (release branch)


 ElasticSearch connector really needs better error handling
 --

 Key: CONNECTORS-629
 URL: https://issues.apache.org/jira/browse/CONNECTORS-629
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 When there is a problem connecting, you get really useless exceptions like 
 this in the log:
 {code}
 ERROR 2013-01-30 13:44:15,356 (Worker thread '45') - Exception tossed:
 org.apache.manifoldcf.core.interfaces.ManifoldCFException:
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnection.call(ElasticSearchConnection.java:97)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchIndex.init(ElasticSearchIndex.java:138)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.addOrReplaceDocument(ElasticSearchConnector.java:322)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1579)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:504)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:370)
 at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1652)
 at org.apache.manifoldcf.crawler.connectors.DCTM.DCTM.processDocuments(DCTM.java:1820)
 at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
 at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:551)
 {code}
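
The trace above carries no message after `ManifoldCFException:`, which is what makes it hard to act on. A minimal sketch of the kind of message that would help is shown below; `ConnectionErrors.describe` is a hypothetical helper, not part of the connector, and the wording of the message is an assumption.

```java
import java.io.IOException;

// Hypothetical helper for building a useful error message; not ManifoldCF code.
final class ConnectionErrors {
  private ConnectionErrors() {}

  // Names the server and falls back to the exception class when the
  // underlying exception has no message of its own.
  static String describe(String serverUrl, IOException cause) {
    String detail = cause.getMessage();
    if (detail == null || detail.isEmpty()) {
      detail = cause.getClass().getName();
    }
    return "Could not connect to '" + serverUrl + "': " + detail;
  }
}
```

The resulting string could be passed as the message of the exception that is thrown, with the original `IOException` kept as its cause so the underlying stack trace is not lost.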



[jira] [Resolved] (CONNECTORS-637) NPE from Elastic Search connector

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-637.


Resolution: Fixed

r1444089 (release branch)

 NPE from Elastic Search connector
 -

 Key: CONNECTORS-637
 URL: https://issues.apache.org/jira/browse/CONNECTORS-637
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 NPE as follows:
 {code}
 FATAL 2013-02-01 14:32:38,255 (Worker thread '5') - Error tossed: null
 java.lang.NullPointerException
 at java.util.TreeMap.getEntry(TreeMap.java:324)
 at java.util.TreeMap.containsKey(TreeMap.java:209)
 at java.util.TreeSet.contains(TreeSet.java:217)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchSpecs.checkMimeType(ElasticSearchSpecs.java:164)
 at org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector.checkMimeTypeIndexable(ElasticSearchConnector.java:333)
 at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.checkMimeTypeIndexable(IncrementalIngester.java:212)
 at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.checkMimeTypeIndexable(WorkerThread.java:2091)
 at org.apache.manifoldcf.crawler.connectors.DCTM.DCTM.processDocuments(DCTM.java:1811)
 at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
 at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:556)
 {code}
 Apparently the DocumentumConnector is passing in null mime types, and the 
 elastic search connector doesn't like it.
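
The trace shows `TreeSet.contains` being handed a null key, which a naturally ordered `TreeSet` rejects with a `NullPointerException`. A guard along the following lines would avoid that. This is only a sketch: `MimeTypeFilter` is a hypothetical stand-in for the real `ElasticSearchSpecs` class, and treating a null mime type as "not indexable" is an assumed policy, not necessarily what the committed fix does.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the connector's mime-type check.
final class MimeTypeFilter {
  private final Set<String> allowed = new TreeSet<String>();

  MimeTypeFilter(String... mimeTypes) {
    for (String mimeType : mimeTypes) {
      allowed.add(mimeType);
    }
  }

  // Returns whether the mime type is in the allowed set.
  boolean isIndexable(String mimeType) {
    // Assumed policy: a missing mime type is rejected instead of being
    // passed to TreeSet.contains, which throws on null keys.
    if (mimeType == null) {
      return false;
    }
    return allowed.contains(mimeType);
  }
}
```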



[jira] [Resolved] (CONNECTORS-634) ElasticSearch now returns 201 http codes on success, not just 200 codes

2013-02-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-634.


Resolution: Fixed

r1444089 (release branch)


 ElasticSearch now returns 201 http codes on success, not just 200 codes
 ---

 Key: CONNECTORS-634
 URL: https://issues.apache.org/jira/browse/CONNECTORS-634
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.1.1


 Elastic Search now returns httpcode 201 for successful indexing events.  The 
 connector needs to accept these.
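
A check that tolerates both codes might look like the sketch below. `HttpStatusCheck` is a hypothetical name, and accepting the whole 2xx range rather than exactly 200 and 201 is an assumption; the issue text does not say which the fix does.

```java
// Hypothetical helper; not the connector's actual code.
final class HttpStatusCheck {
  private HttpStatusCheck() {}

  // Treats any 2xx status as success, which covers 200 (OK) and 201 (Created).
  static boolean isSuccess(int statusCode) {
    return statusCode >= 200 && statusCode < 300;
  }
}
```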



[jira] [Commented] (CONNECTORS-639) maven execute of jetty-runner fails

2013-02-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575273#comment-13575273
 ] 

Karl Wright commented on CONNECTORS-639:


Hmm.

I pulled this up to the release branch, and now I get the following when I try 
mvn clean install there:

{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project mcf-jettyrunner: Compilation failure: Compilation failure:
[ERROR] \wip\mcf\release-1.1-branch\framework\jetty-runner\src\main\java\org\apache\manifoldcf\jettyrunner\ManifoldCFCombinedJettyRunner.java:[29,25] package javax.servlet.http does not exist
[ERROR] \wip\mcf\release-1.1-branch\framework\jetty-runner\src\main\java\org\apache\manifoldcf\jettyrunner\ManifoldCFCombinedJettyRunner.java:[30,25] package javax.servlet.http does not exist
[ERROR] \wip\mcf\release-1.1-branch\framework\jetty-runner\src\main\java\org\apache\manifoldcf\jettyrunner\ManifoldCFCombinedJettyRunner.java:[31,25] package javax.servlet.http does not exist
[ERROR] \wip\mcf\release-1.1-branch\framework\jetty-runner\src\main\java\org\apache\manifoldcf\jettyrunner\ManifoldCFJettyRunner.java:[29,25] package javax.servlet.http does not exist
[ERROR] \wip\mcf\release-1.1-branch\framework\jetty-runner\src\main\java\org\apache\manifoldcf\jettyrunner\ManifoldCFJettyRunner.java:[30,25] package javax.servlet.http does not exist
[ERROR] \wip\mcf\release-1.1-branch\framework\jetty-runner\src\main\java\org\apache\manifoldcf\jettyrunner\ManifoldCFJettyRunner.java:[31,25] package javax.servlet.http does not exist
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
{code}


 maven execute of jetty-runner fails
 ---

 Key: CONNECTORS-639
 URL: https://issues.apache.org/jira/browse/CONNECTORS-639
 Project: ManifoldCF
  Issue Type: Bug
Reporter: Maciej Lizewski
Assignee: Maciej Lizewski
 Fix For: ManifoldCF 1.1.1


 When trying to run jetty-runner with:
 mvn exec:exec
 inside jetty-runner directory, exception is thrown:
 Exception in thread "main" java.lang.NoClassDefFoundError: javax/servlet/http/HttpServletRequest



[jira] [Commented] (CONNECTORS-633) Look into using simple tag parser, from Web Connector, for RSS parsing needs

2013-02-10 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575565#comment-13575565
 ] 

Karl Wright commented on CONNECTORS-633:


r1444628


 Look into using simple tag parser, from Web Connector, for RSS parsing needs
 

 Key: CONNECTORS-633
 URL: https://issues.apache.org/jira/browse/CONNECTORS-633
 Project: ManifoldCF
  Issue Type: Task
  Components: RSS connector, Web connector
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Our last major custom dependency is on a hacked version of xerces.  This was 
 done initially to fix a memory leak, and to allow parsing of sloppy RSS 
 feeds.  We might be able to eliminate this if we verify that the memory leak 
 has been fixed in a more modern xerces, and we move towards using our 
 homegrown sloppy tag parser developed for the Web connector, for RSS feed 
 processing.



[jira] [Resolved] (CONNECTORS-633) Look into using simple tag parser, from Web Connector, for RSS parsing needs

2013-02-10 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-633.


Resolution: Fixed

 Look into using simple tag parser, from Web Connector, for RSS parsing needs
 

 Key: CONNECTORS-633
 URL: https://issues.apache.org/jira/browse/CONNECTORS-633
 Project: ManifoldCF
  Issue Type: Task
  Components: RSS connector, Web connector
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2





[jira] [Created] (CONNECTORS-642) Need an ElasticSearch plugin for enforcing ManifoldCF security

2013-02-11 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-642:
--

 Summary: Need an ElasticSearch plugin for enforcing ManifoldCF 
security
 Key: CONNECTORS-642
 URL: https://issues.apache.org/jira/browse/CONNECTORS-642
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


ElasticSearch is becoming popular and we need to support it fully.  In order 
for that to happen, we really need an ElasticSearch ManifoldCF plugin.



[jira] [Created] (CONNECTORS-643) ElasticSearch delete function always returns ERROR status on deletion

2013-02-11 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-643:
--

 Summary: ElasticSearch delete function always returns ERROR status 
on deletion
 Key: CONNECTORS-643
 URL: https://issues.apache.org/jira/browse/CONNECTORS-643
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


When viewed in the Simple History, all deleted documents from ElasticSearch 
return ERROR with no description.




[jira] [Assigned] (CONNECTORS-644) Restart of job causes Repeated service interruptions

2013-02-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-644:
--

Assignee: Karl Wright

 Restart of job causes Repeated service interruptions
 

 Key: CONNECTORS-644
 URL: https://issues.apache.org/jira/browse/CONNECTORS-644
 Project: ManifoldCF
  Issue Type: Bug
  Components: Web connector
Reporter: Erlend Garåsen
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2

 Attachments: manifoldcf.log


 If you start a job defined for a web crawler and later click the restart 
 link, it will actually never restart but end with the following: Error: 
 Repeated service interruptions - failure getting document version
 This problem happens regardless of which output connector is defined. This 
 behaviour can easily be reproduced by defining a null output connector, a web 
 crawler and a job. It will happen on any servlet container, even by running 
 Jetty within the example folder. A manifoldcf log is attached.



[jira] [Commented] (CONNECTORS-644) Restart of job causes Repeated service interruptions

2013-02-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13576918#comment-13576918
 ] 

Karl Wright commented on CONNECTORS-644:


I think the right way to fix this is to introduce a new variant of the 
ServiceInterruption which can be used to signal job abort without being treated 
as a fatal error.
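
A minimal sketch of what such a variant could look like. The class, field, and status names below are hypothetical illustrations and are not taken from the ManifoldCF source; the idea is only that the exception carries a flag telling the worker to stop quietly rather than report an error.

```java
// Hypothetical sketch; not the actual ManifoldCF ServiceInterruption class.
final class ServiceInterruptionSketch extends Exception {
    // True when the interruption only signals that the job is no longer active.
    private final boolean jobInactiveAbort;

    ServiceInterruptionSketch(String message, boolean jobInactiveAbort) {
        super(message);
        this.jobInactiveAbort = jobInactiveAbort;
    }

    boolean isJobInactiveAbort() {
        return jobInactiveAbort;
    }
}

// Hypothetical worker-side handling of the two cases.
final class WorkerSketch {
    /** Returns the status a worker would record after catching the interruption. */
    static String handle(ServiceInterruptionSketch e) {
        if (e.isJobInactiveAbort()) {
            // Job was aborted or restarted: stop processing, record no error.
            return "ABORTED_QUIETLY";
        }
        // Ordinary interruption: surface it as an error.
        return "ERROR: " + e.getMessage();
    }
}
```

With a flag like this, a restart could unwind in-flight document fetches without producing the "Repeated service interruptions" error described in the issue.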


 Restart of job causes Repeated service interruptions
 

 Key: CONNECTORS-644
 URL: https://issues.apache.org/jira/browse/CONNECTORS-644
 Project: ManifoldCF
  Issue Type: Bug
  Components: Web connector
Reporter: Erlend Garåsen
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2

 Attachments: manifoldcf.log


 If you start a job defined for a web crawler and later click the restart 
 link, it will actually never restart but end with the following: Error: 
 Repeated service interruptions - failure getting document version
 This problem happens regardless of which output connector is defined. This 
 behaviour can easily be reproduced by defining a null output connector, a web 
 crawler and a job. It will happen on any servlet container, even by running 
 Jetty within the example folder. A manifoldcf log is attached.



[jira] [Resolved] (CONNECTORS-644) Restart of job causes Repeated service interruptions

2013-02-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-644.


Resolution: Fixed

 Restart of job causes Repeated service interruptions
 

 Key: CONNECTORS-644
 URL: https://issues.apache.org/jira/browse/CONNECTORS-644
 Project: ManifoldCF
  Issue Type: Bug
  Components: Web connector
Reporter: Erlend Garåsen
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2

 Attachments: manifoldcf.log


 If you start a job defined for a web crawler and later click the restart 
 link, it will actually never restart but end with the following: Error: 
 Repeated service interruptions - failure getting document version
 This problem happens regardless of which output connector is defined. This 
 behaviour can easily be reproduced by defining a null output connector, a web 
 crawler and a job. It will happen on any servlet container, even by running 
 Jetty within the example folder. A manifoldcf log is attached.



[jira] [Assigned] (CONNECTORS-645) forced metadata causes NPE on document deletion

2013-02-13 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-645:
--

Assignee: Karl Wright

 forced metadata causes NPE on document deletion
 ---

 Key: CONNECTORS-645
 URL: https://issues.apache.org/jira/browse/CONNECTORS-645
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Reporter: Maciej Lizewski
Assignee: Karl Wright
Priority: Blocker

 WorkerThread::deleteDocument
 {
   if (version.length() == 0)
     deleteDocument(documentIdentifier);
   else
     ingestDocument(documentIdentifier,version,null,null);
 }
 Look at the 'else' part: it calls ingestDocument with null as the document 
 parameter. Then look at ingestDocument:
   // Modify the repository document with forced parameters.
   for (String paramName : forcedMetadata.keySet())
   {
     Set<String> values = forcedMetadata.get(paramName);
     String[] paramValues = new String[values.size()];
     int j = 0;
     for (String value : values)
     {
       paramValues[j++] = value;
     }
     data.addField(paramName,paramValues);
   }
 It tries to set forced metadata even if 'data' (the document) is null...
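
One possible guard, shown as a sketch only. `DocSketch` and `ForcedMetadataSketch` are hypothetical stand-ins rather than ManifoldCF classes, and the actual fix committed for this issue may differ. The point is simply to skip the forced-metadata loop when no document is supplied, which is the deletion case described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the repository document; not the ManifoldCF class.
final class DocSketch {
    final Map<String, String[]> fields = new LinkedHashMap<>();

    void addField(String name, String[] values) {
        fields.put(name, values);
    }
}

final class ForcedMetadataSketch {
    /**
     * Copies forced metadata into the document.
     * Returns false and does nothing when data is null (the delete path).
     */
    static boolean applyForcedMetadata(DocSketch data, Map<String, Set<String>> forcedMetadata) {
        if (data == null) {
            // Nothing to decorate: the caller is removing the document.
            return false;
        }
        for (Map.Entry<String, Set<String>> entry : forcedMetadata.entrySet()) {
            Set<String> values = entry.getValue();
            data.addField(entry.getKey(), values.toArray(new String[0]));
        }
        return true;
    }
}
```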



[jira] [Commented] (CONNECTORS-645) forced metadata causes NPE on document deletion

2013-02-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13577550#comment-13577550
 ] 

Karl Wright commented on CONNECTORS-645:


r1445585


 forced metadata causes NPE on document deletion
 ---

 Key: CONNECTORS-645
 URL: https://issues.apache.org/jira/browse/CONNECTORS-645
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Reporter: Maciej Lizewski
Assignee: Karl Wright
Priority: Blocker

   //in WorkerThread::deleteDocument
   {
   ...
   if (version.length() == 0)
     deleteDocument(documentIdentifier);
   else
     ingestDocument(documentIdentifier,version,null,null);
   }
 Look at the 'else' part: it calls ingestDocument with null as the document 
 parameter. Then look at ingestDocument:
   // Modify the repository document with forced parameters.
   for (String paramName : forcedMetadata.keySet())
   {
     Set<String> values = forcedMetadata.get(paramName);
     String[] paramValues = new String[values.size()];
     int j = 0;
     for (String value : values)
     {
       paramValues[j++] = value;
     }
     data.addField(paramName,paramValues);
   }
 It tries to set forced metadata even if 'data' (the document) is null...



[jira] [Resolved] (CONNECTORS-645) forced metadata causes NPE on document deletion

2013-02-13 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-645.


Resolution: Fixed

 forced metadata causes NPE on document deletion
 ---

 Key: CONNECTORS-645
 URL: https://issues.apache.org/jira/browse/CONNECTORS-645
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Reporter: Maciej Lizewski
Assignee: Karl Wright
Priority: Blocker

   //in WorkerThread::deleteDocument
   {
   ...
   if (version.length() == 0)
     deleteDocument(documentIdentifier);
   else
     ingestDocument(documentIdentifier,version,null,null);
   }
 Look at the 'else' part: it calls ingestDocument with null as the document 
 parameter. Then look at ingestDocument:
   // Modify the repository document with forced parameters.
   for (String paramName : forcedMetadata.keySet())
   {
     Set<String> values = forcedMetadata.get(paramName);
     String[] paramValues = new String[values.size()];
     int j = 0;
     for (String value : values)
     {
       paramValues[j++] = value;
     }
     data.addField(paramName,paramValues);
   }
 It tries to set forced metadata even if 'data' (the document) is null...



[jira] [Updated] (CONNECTORS-645) forced metadata causes NPE on document deletion

2013-02-13 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-645:
---

Fix Version/s: ManifoldCF 1.2
Affects Version/s: ManifoldCF 1.2

 forced metadata causes NPE on document deletion
 ---

 Key: CONNECTORS-645
 URL: https://issues.apache.org/jira/browse/CONNECTORS-645
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.2
Reporter: Maciej Lizewski
Assignee: Karl Wright
Priority: Blocker
 Fix For: ManifoldCF 1.2


   //in WorkerThread::deleteDocument
   {
   ...
   if (version.length() == 0)
     deleteDocument(documentIdentifier);
   else
     ingestDocument(documentIdentifier,version,null,null);
   }
 Look at the 'else' part: it calls ingestDocument with null as the document 
 parameter. Then look at ingestDocument:
   // Modify the repository document with forced parameters.
   for (String paramName : forcedMetadata.keySet())
   {
     Set<String> values = forcedMetadata.get(paramName);
     String[] paramValues = new String[values.size()];
     int j = 0;
     for (String value : values)
     {
       paramValues[j++] = value;
     }
     data.addField(paramName,paramValues);
   }
 It tries to set forced metadata even if 'data' (the document) is null...



[jira] [Commented] (CONNECTORS-646) Elastic Search Connector sending unencoded JSON to ElasticSearch

2013-02-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13578043#comment-13578043
 ] 

Karl Wright commented on CONNECTORS-646:


I believe this is already fixed on trunk.  See CONNECTORS-641.


 Elastic Search Connector sending unencoded JSON to ElasticSearch
 

 Key: CONNECTORS-646
 URL: https://issues.apache.org/jira/browse/CONNECTORS-646
 Project: ManifoldCF
  Issue Type: Bug
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Tony Edgin

 A website I'm trying to crawl puts ETag parameters in its HTTP response 
 headers.  The ETags have values that are text surrounded by quotation marks (").  
 When I use ManifoldCF to crawl the website and output to Elastic Search, 
 Elastic Search logs an exception.  It appears the JSON parser reads an actual 
 quote mark (") instead of an escaped one (\").  The parser interprets the quote 
 mark as the end of the value instead of part of the value.
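
For illustration, a minimal sketch of the kind of escaping a JSON string value needs before it is sent. `JsonEscapeSketch` is a hypothetical helper, not the Elastic Search connector's code, and the fix referenced in CONNECTORS-641 may be implemented differently.

```java
// Hypothetical helper; not part of the ManifoldCF Elastic Search connector.
final class JsonEscapeSketch {
    /** Escapes a string so it can be placed between the quotes of a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;   // quote mark becomes \"
                case '\\': sb.append("\\\\"); break;   // backslash becomes \\
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

An ETag value such as "abc123" (including its surrounding quote marks) would then be emitted as \"abc123\" inside the JSON string, so the parser no longer treats the first quote mark as the end of the value.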



[jira] [Commented] (CONNECTORS-599) Derby stalls, does not perform well, on multi-threaded tests

2013-02-15 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13579046#comment-13579046
 ] 

Karl Wright commented on CONNECTORS-599:


I'm told that the fix will be in the next Derby 10.9.x release, which is likely 
due out before our target 1.2 release date.


 Derby stalls, does not perform well, on multi-threaded tests
 

 Key: CONNECTORS-599
 URL: https://issues.apache.org/jira/browse/CONNECTORS-599
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 1.0, ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Derby has been problematic for a while.  One test reproduces the problem 
 reliably: ant run-rss-tests-derby. I've opened a ticket with the Derby 
 project to track it (DERBY-6011), but there appears to be little interest 
 in addressing it.



[jira] [Updated] (CONNECTORS-647) Solr connector should pass filename to update/extract handler

2013-02-17 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-647:
---

  Component/s: (was: Solr-4.x-component)
   Lucene/SOLR connector
Fix Version/s: ManifoldCF 1.2
 Assignee: Karl Wright
Affects Version/s: ManifoldCF 1.1.1

 Solr connector should pass filename to update/extract handler
 -

 Key: CONNECTORS-647
 URL: https://issues.apache.org/jira/browse/CONNECTORS-647
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 According to the ExtractingRequestHandler documentation 
 (http://wiki.apache.org/solr/ExtractingRequestHandler), there can be an 
 additional parameter, resource.name, holding the filename to help Tika guess 
 the MIME type:
 resource.name=<File Name> - The optional name of the file. Tika can use it as 
 a hint for detecting mime type.
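
As a rough sketch of what passing the parameter involves: the helper below appends resource.name to an update/extract URL using only the JDK. `ExtractUrlSketch` and the localhost URL in the usage note are assumptions for illustration; the Solr connector builds its requests through its own client code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; not the ManifoldCF Solr connector's request builder.
final class ExtractUrlSketch {
    /** Appends resource.name=<URL-encoded file name> to the given handler URL. */
    static String withResourceName(String baseUrl, String fileName) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        try {
            return baseUrl + separator + "resource.name=" + URLEncoder.encode(fileName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, calling withResourceName with an assumed base URL of http://localhost:8983/solr/update/extract and the file name "my report.pdf" yields a URL ending in ?resource.name=my+report.pdf.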



[jira] [Commented] (CONNECTORS-647) Solr connector should pass filename to update/extract handler

2013-02-17 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580273#comment-13580273
 ] 

Karl Wright commented on CONNECTORS-647:


I have no objection to adding this.  However, the stream_name and stream_length 
parameters also do the same thing, or used to, so this may not have any effect.


 Solr connector should pass filename to update/extract handler
 -

 Key: CONNECTORS-647
 URL: https://issues.apache.org/jira/browse/CONNECTORS-647
 Project: ManifoldCF
  Issue Type: Bug
  Components: Solr-4.x-component
Reporter: Maciej Lizewski

 According to the ExtractingRequestHandler documentation 
 (http://wiki.apache.org/solr/ExtractingRequestHandler), there can be an 
 additional parameter, resource.name, holding the filename to help Tika guess 
 the MIME type:
 resource.name=<File Name> - The optional name of the file. Tika can use it as 
 a hint for detecting mime type.



[jira] [Commented] (CONNECTORS-647) Solr connector should pass filename to update/extract handler

2013-02-17 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580275#comment-13580275
 ] 

Karl Wright commented on CONNECTORS-647:


r1447066


 Solr connector should pass filename to update/extract handler
 -

 Key: CONNECTORS-647
 URL: https://issues.apache.org/jira/browse/CONNECTORS-647
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 According to the ExtractingRequestHandler documentation 
 (http://wiki.apache.org/solr/ExtractingRequestHandler), there can be an 
 additional parameter, resource.name, holding the filename to help Tika guess 
 the MIME type:
 resource.name=<File Name> - The optional name of the file. Tika can use it as 
 a hint for detecting mime type.



[jira] [Commented] (CONNECTORS-648) Wars fail to deploy in Tomcat 7

2013-02-17 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580354#comment-13580354
 ] 

Karl Wright commented on CONNECTORS-648:


Hmm, we have had no reports of difficulty with this for quite some time.  
I have tested on Jetty and Resin fairly extensively.

I'll try the patch locally here to see if there are any Jetty issues.  If not, 
I will try Resin and will commit if Resin too is happy.


 Wars fail to deploy in Tomcat 7
 ---

 Key: CONNECTORS-648
 URL: https://issues.apache.org/jira/browse/CONNECTORS-648
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1.1
Reporter: Jan Høydahl
  Labels: Tomcat7
 Attachments: CONNECTORS-648.patch


 This is a followup on CONNECTORS-568, which I believe is still not solved.
 I deploy mcf-combined-service.war in Tomcat 7.0.35, and I get the error
 {noformat}
 IllegalArgumentException: taglib definition not consistent with specification 
 version
 {noformat}
 Then I unpack the WAR and remove the whole {{jsp-config}} and 
 {{taglibs}} from it, as advised in 
 http://wiki.metawerx.net/wiki/RemovingTaglibFromWeb.xml
 After restarting, startup succeeds, but trying to hit 
 http://localhost:8080/mcf/ brings this error from ./index.jsp line 1
 {noformat}
 org.apache.jasper.JasperException: 
 The absolute uri: http://java.sun.com/jstl/core cannot be resolved 
 in either web.xml or the jar files deployed with this application
 {noformat}
 So then I open the JSPs and see that the taglib refs in {{adminDefaults.jsp}} 
 and {{adminHeaders.jsp}} refer to {{http://java.sun.com/jstl/core}}, which is 
 wrong. According to http://stackoverflow.com/tags/jstl/info, since JSTL 1.1 the 
 URI must contain /jsp/ (that is, http://java.sun.com/jsp/jstl/core). Changing 
 that and restarting brings up the GUI just fine!



[jira] [Updated] (CONNECTORS-648) Wars fail to deploy in Tomcat 7

2013-02-17 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-648:
---

Fix Version/s: ManifoldCF 1.2

 Wars fail to deploy in Tomcat 7
 ---

 Key: CONNECTORS-648
 URL: https://issues.apache.org/jira/browse/CONNECTORS-648
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1.1
Reporter: Jan Høydahl
Assignee: Karl Wright
  Labels: Tomcat7
 Fix For: ManifoldCF 1.2

 Attachments: CONNECTORS-648.patch


 This is a followup on CONNECTORS-568, which I believe is still not solved.
 I deploy mcf-combined-service.war in Tomcat 7.0.35, and I get the error
 {noformat}
 IllegalArgumentException: taglib definition not consistent with specification 
 version
 {noformat}
 Then I unpack the WAR and remove the whole {{jsp-config}} and 
 {{taglibs}} from it, as advised in 
 http://wiki.metawerx.net/wiki/RemovingTaglibFromWeb.xml
 After restarting, startup succeeds, but trying to hit 
 http://localhost:8080/mcf/ brings this error from ./index.jsp line 1
 {noformat}
 org.apache.jasper.JasperException: 
 The absolute uri: http://java.sun.com/jstl/core cannot be resolved 
 in either web.xml or the jar files deployed with this application
 {noformat}
 So then I open the JSPs and see that the taglib refs in {{adminDefaults.jsp}} 
 and {{adminHeaders.jsp}} refer to {{http://java.sun.com/jstl/core}}, which is 
 wrong. According to http://stackoverflow.com/tags/jstl/info, since JSTL 1.1 the 
 URI must contain /jsp/ (that is, http://java.sun.com/jsp/jstl/core). Changing 
 that and restarting brings up the GUI just fine!



[jira] [Commented] (CONNECTORS-623) stream_size and stream_name can't be sent

2013-02-17 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580460#comment-13580460
 ] 

Karl Wright commented on CONNECTORS-623:


Actually, it should be possible to get the necessary information with 
HttpComponents HttpClient wire logging in trunk.  But in 1.0.1 wire logging 
won't work - you have to use WireShark there.


 stream_size and stream_name can't be sent
 -

 Key: CONNECTORS-623
 URL: https://issues.apache.org/jira/browse/CONNECTORS-623
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1
Reporter: Shinichiro Abe
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 1.1, ManifoldCF 1.2

 Attachments: CONNECTORS-623.patch


 These metadata can be sent to Solr in MCF 1.0.1 but can not be sent in MCF 
 1.1.
 I think it is because of SolrJ.



[jira] [Comment Edited] (CONNECTORS-623) stream_size and stream_name can't be sent

2013-02-17 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580460#comment-13580460
 ] 

Karl Wright edited comment on CONNECTORS-623 at 2/18/13 6:59 AM:
-

Actually, it should be possible to get the necessary information with 
HttpComponents HttpClient wire logging in trunk.  But in 1.0.1 wire logging 
won't work - you have to use WireShark there.

Looking at the 1.0.1 code, the way the info is transmitted is as follows:

{code}
String value = "Content-Disposition: form-data";
if (name != null)
  value += "; name=\""+name+"\"";
if (fileName != null)
  value += "; filename=\""+fileName+"\"";
value += "\r\n";
byte[] tmp = value.getBytes("UTF-8");
rval += tmp.length;
tmp = ("Content-Type: "+contentType+"\r\n\r\n").getBytes("ASCII");
{code}

... which means that there are two headers in the multipart form section of the 
document:

{code}
Content-Disposition: form-data; name="name"; filename="filename"
Content-Type: content-type
{code}

... where, for the content, "name" is "myfile", and "filename" is the file name.

If this is not what the multipart form poster in 1.1.1 is actually doing, I 
should be able to fix it to do what we need.  But I'd like first to understand 
what it's currently doing before I start changing things, because if it is 
already working this way then the problem is that Solr changed too.
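
To make the expected wire format concrete, here is a small self-contained sketch that produces the two part headers described above. `MultipartHeaderSketch` is a hypothetical helper written for this note; it is neither the 1.0.1 poster code nor the 1.1.1 SolrJ-based code.

```java
// Hypothetical helper; illustrates the header layout only.
final class MultipartHeaderSketch {
    /** Builds the header block that precedes the file content in one multipart/form-data part. */
    static String partHeaders(String name, String fileName, String contentType) {
        StringBuilder sb = new StringBuilder("Content-Disposition: form-data");
        if (name != null) {
            sb.append("; name=\"").append(name).append('"');
        }
        if (fileName != null) {
            sb.append("; filename=\"").append(fileName).append('"');
        }
        sb.append("\r\n");
        // A blank line (second CRLF) separates the headers from the part body.
        sb.append("Content-Type: ").append(contentType).append("\r\n\r\n");
        return sb.toString();
    }
}
```

Comparing a string like this with what wire logging or WireShark shows for 1.1.1 would indicate whether the filename is still being transmitted in the Content-Disposition header.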




  was (Author: kwri...@metacarta.com):
Actually, it should be possible to get the necessary information with 
HttpComponents HttpClient wire logging in trunk.  But in 1.0.1 wire logging 
won't work - you have to use WireShark there.

  
 stream_size and stream_name can't be sent
 -

 Key: CONNECTORS-623
 URL: https://issues.apache.org/jira/browse/CONNECTORS-623
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1
Reporter: Shinichiro Abe
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 1.1, ManifoldCF 1.2

 Attachments: CONNECTORS-623.patch


 These metadata can be sent to Solr in MCF 1.0.1 but can not be sent in MCF 
 1.1.
 I think it is because of SolrJ.



[jira] [Commented] (CONNECTORS-623) stream_size and stream_name can't be sent

2013-02-18 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580869#comment-13580869
 ] 

Karl Wright commented on CONNECTORS-623:


r1447526 fixes a problem with deletes and commits.  Now the Solr integration 
test passes too.


 stream_size and stream_name can't be sent
 -

 Key: CONNECTORS-623
 URL: https://issues.apache.org/jira/browse/CONNECTORS-623
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1
Reporter: Shinichiro Abe
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 1.1, ManifoldCF 1.2

 Attachments: CONNECTORS-623.patch


 These metadata can be sent to Solr in MCF 1.0.1 but can not be sent in MCF 
 1.1.
 I think it is because of SolrJ.



[jira] [Commented] (CONNECTORS-623) stream_size and stream_name can't be sent

2013-02-18 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580893#comment-13580893
 ] 

Karl Wright commented on CONNECTORS-623:


r1447556 is yet another fix, this time for an illegal argument exception.


 stream_size and stream_name can't be sent
 -

 Key: CONNECTORS-623
 URL: https://issues.apache.org/jira/browse/CONNECTORS-623
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1
Reporter: Shinichiro Abe
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 1.1, ManifoldCF 1.2

 Attachments: CONNECTORS-623.patch


 These metadata can be sent to Solr in MCF 1.0.1 but can not be sent in MCF 
 1.1.
 I think it is because of SolrJ.



[jira] [Resolved] (CONNECTORS-623) stream_size and stream_name can't be sent

2013-02-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-623.


Resolution: Fixed

 stream_size and stream_name can't be sent
 -

 Key: CONNECTORS-623
 URL: https://issues.apache.org/jira/browse/CONNECTORS-623
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1
Reporter: Shinichiro Abe
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 1.2, ManifoldCF 1.1

 Attachments: CONNECTORS-623.patch


 These metadata can be sent to Solr in MCF 1.0.1 but can not be sent in MCF 
 1.1.
 I think it is because of SolrJ.



[jira] [Resolved] (CONNECTORS-648) Wars fail to deploy in Tomcat 7

2013-02-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-648.


Resolution: Fixed

 Wars fail to deploy in Tomcat 7
 ---

 Key: CONNECTORS-648
 URL: https://issues.apache.org/jira/browse/CONNECTORS-648
 Project: ManifoldCF
  Issue Type: Bug
Affects Versions: ManifoldCF 1.1.1
Reporter: Jan Høydahl
Assignee: Karl Wright
  Labels: Tomcat7
 Fix For: ManifoldCF 1.2

 Attachments: CONNECTORS-648.patch, CONNECTORS-648.patch


 This is a followup on CONNECTORS-568, which I believe is still not solved.
 I deploy mcf-combined-service.war in Tomcat 7.0.35, and I get the error
 {noformat}
 IllegalArgumentException: taglib definition not consistent with specification 
 version
 {noformat}
 Then I unpack the WAR and remove the whole {{jsp-config}} and 
 {{taglibs}} from it, as advised in 
 http://wiki.metawerx.net/wiki/RemovingTaglibFromWeb.xml
 After restarting, startup succeeds, but trying to hit 
 http://localhost:8080/mcf/ brings this error from ./index.jsp line 1
 {noformat}
 org.apache.jasper.JasperException: 
 The absolute uri: http://java.sun.com/jstl/core cannot be resolved 
 in either web.xml or the jar files deployed with this application
 {noformat}
 So then I open the JSPs and see that the taglib refs in {{adminDefaults.jsp}} 
 and {{adminHeaders.jsp}} refer to {{http://java.sun.com/jstl/core}}, which is 
 wrong. According to http://stackoverflow.com/tags/jstl/info, since JSTL 1.1 the 
 URI must contain /jsp/ (that is, http://java.sun.com/jsp/jstl/core). Changing 
 that and restarting brings up the GUI just fine!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CONNECTORS-650) SharePoint and Meridio connector need connection timeouts, or they hang when connections are refused

2013-02-21 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-650:
--

 Summary: SharePoint and Meridio connector need connection 
timeouts, or they hang when connections are refused
 Key: CONNECTORS-650
 URL: https://issues.apache.org/jira/browse/CONNECTORS-650
 Project: ManifoldCF
  Issue Type: Bug
  Components: Meridio connector, SharePoint connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


SharePoint and Meridio connector need connection timeouts, or they hang when 
connections are refused.
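
As a generic illustration of bounding a connection attempt, the sketch below uses plain java.net sockets. It is an assumption-laden example, not the SharePoint or Meridio connector code: those connectors use their own HTTP/SOAP client stacks, and the committed fix may set timeouts differently. `ConnectTimeoutSketch` and its parameter names are hypothetical.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper; shows explicit connect and read timeouts with the JDK only.
final class ConnectTimeoutSketch {
    /**
     * Opens a TCP connection, throwing java.net.SocketTimeoutException if the
     * connection is not established within connectTimeoutMs. Reads on the
     * returned socket give up after readTimeoutMs instead of blocking forever.
     */
    static Socket open(String host, int port, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            socket.setSoTimeout(readTimeoutMs);
            return socket;
        } catch (IOException e) {
            // Do not leak the socket when the attempt fails or times out.
            socket.close();
            throw e;
        }
    }
}
```

Without an explicit connect timeout, an unanswered connection attempt can block for as long as the operating system allows, which from the crawler's point of view looks like a hang.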




[jira] [Commented] (CONNECTORS-650) SharePoint and Meridio connector need connection timeouts, or they hang when connections are refused

2013-02-21 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13583775#comment-13583775
 ] 

Karl Wright commented on CONNECTORS-650:


r1448874


 SharePoint and Meridio connector need connection timeouts, or they hang when 
 connections are refused
 

 Key: CONNECTORS-650
 URL: https://issues.apache.org/jira/browse/CONNECTORS-650
 Project: ManifoldCF
  Issue Type: Bug
  Components: Meridio connector, SharePoint connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 SharePoint and Meridio connector need connection timeouts, or they hang when 
 connections are refused.



[jira] [Resolved] (CONNECTORS-650) SharePoint and Meridio connector need connection timeouts, or they hang when connections are refused

2013-02-21 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-650.


Resolution: Fixed

 SharePoint and Meridio connector need connection timeouts, or they hang when 
 connections are refused
 

 Key: CONNECTORS-650
 URL: https://issues.apache.org/jira/browse/CONNECTORS-650
 Project: ManifoldCF
  Issue Type: Bug
  Components: Meridio connector, SharePoint connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 SharePoint and Meridio connector need connection timeouts, or they hang when 
 connections are refused.



[jira] [Commented] (CONNECTORS-63) Add support for reports to API

2013-02-24 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585457#comment-13585457
 ] 

Karl Wright commented on CONNECTORS-63:
---

r1449526 adds the functionality.  Documentation to follow.


 Add support for reports to API
 --

 Key: CONNECTORS-63
 URL: https://issues.apache.org/jira/browse/CONNECTORS-63
 Project: ManifoldCF
  Issue Type: Improvement
  Components: API
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 The API does not currently support any ManifoldCF reports.  Add this 
 functionality.



[jira] [Commented] (CONNECTORS-63) Add support for reports to API

2013-02-24 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-63?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585523#comment-13585523
 ] 

Karl Wright commented on CONNECTORS-63:
---

r1449563 adds a smoke test.


 Add support for reports to API
 --

 Key: CONNECTORS-63
 URL: https://issues.apache.org/jira/browse/CONNECTORS-63
 Project: ManifoldCF
  Issue Type: Improvement
  Components: API
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 The API does not currently support any ManifoldCF reports.  Add this 
 functionality.



[jira] [Created] (CONNECTORS-652) Document new script engine and API support for reports

2013-02-25 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-652:
--

 Summary: Document new script engine and API support for reports
 Key: CONNECTORS-652
 URL: https://issues.apache.org/jira/browse/CONNECTORS-652
 Project: ManifoldCF
  Issue Type: Task
  Components: Documentation
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


Document new script engine and API support for reports.




[jira] [Resolved] (CONNECTORS-63) Add support for reports to API

2013-02-25 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-63.
---

Resolution: Fixed

r1449642 for CHANGES.txt update

 Add support for reports to API
 --

 Key: CONNECTORS-63
 URL: https://issues.apache.org/jira/browse/CONNECTORS-63
 Project: ManifoldCF
  Issue Type: Improvement
  Components: API
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 The API does not currently support any ManifoldCF reports.  Add this 
 functionality.



[jira] [Commented] (CONNECTORS-651) Add support for query strings into the script client, to support report features

2013-02-25 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585874#comment-13585874
 ] 

Karl Wright commented on CONNECTORS-651:


r1449715 changes the URL addition to use a dedicated path component method.


 Add support for query strings into the script client, to support report 
 features
 

 Key: CONNECTORS-651
 URL: https://issues.apache.org/jira/browse/CONNECTORS-651
 Project: ManifoldCF
  Issue Type: Task
  Components: Scripting client
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Add support for query strings into the script client, to support report 
 features.



[jira] [Commented] (CONNECTORS-651) Add support for query strings into the script client, to support report features

2013-02-25 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585970#comment-13585970
 ] 

Karl Wright commented on CONNECTORS-651:


r1449762


 Add support for query strings into the script client, to support report 
 features
 

 Key: CONNECTORS-651
 URL: https://issues.apache.org/jira/browse/CONNECTORS-651
 Project: ManifoldCF
  Issue Type: Task
  Components: Scripting client
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Add support for query strings into the script client, to support report 
 features.



[jira] [Resolved] (CONNECTORS-651) Add support for query strings into the script client, to support report features

2013-02-25 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-651.


Resolution: Fixed

 Add support for query strings into the script client, to support report 
 features
 

 Key: CONNECTORS-651
 URL: https://issues.apache.org/jira/browse/CONNECTORS-651
 Project: ManifoldCF
  Issue Type: Task
  Components: Scripting client
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Add support for query strings into the script client, to support report 
 features.



[jira] [Commented] (CONNECTORS-651) Add support for query strings into the script client, to support report features

2013-02-25 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585974#comment-13585974
 ] 

Karl Wright commented on CONNECTORS-651:


r1449767 commits some code that was missed in the earlier commit.


 Add support for query strings into the script client, to support report 
 features
 

 Key: CONNECTORS-651
 URL: https://issues.apache.org/jira/browse/CONNECTORS-651
 Project: ManifoldCF
  Issue Type: Task
  Components: Scripting client
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Add support for query strings into the script client, to support report 
 features.



[jira] [Commented] (CONNECTORS-652) Document new script engine and API support for reports

2013-02-26 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586944#comment-13586944
 ] 

Karl Wright commented on CONNECTORS-652:


r1450079 for API documentation.


 Document new script engine and API support for reports
 --

 Key: CONNECTORS-652
 URL: https://issues.apache.org/jira/browse/CONNECTORS-652
 Project: ManifoldCF
  Issue Type: Task
  Components: Documentation
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Document new script engine and API support for reports.



[jira] [Commented] (CONNECTORS-652) Document new script engine and API support for reports

2013-02-26 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13587193#comment-13587193
 ] 

Karl Wright commented on CONNECTORS-652:


r1450211 for script engine documentation.


 Document new script engine and API support for reports
 --

 Key: CONNECTORS-652
 URL: https://issues.apache.org/jira/browse/CONNECTORS-652
 Project: ManifoldCF
  Issue Type: Task
  Components: Documentation
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Document new script engine and API support for reports.



[jira] [Created] (CONNECTORS-653) SharePoint connector should set file name

2013-02-28 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-653:
--

 Summary: SharePoint connector should set file name
 Key: CONNECTORS-653
 URL: https://issues.apache.org/jira/browse/CONNECTORS-653
 Project: ManifoldCF
  Issue Type: Improvement
  Components: SharePoint connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


SharePoint connector should set file name




[jira] [Commented] (CONNECTORS-653) SharePoint connector should set file name

2013-02-28 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589858#comment-13589858
 ] 

Karl Wright commented on CONNECTORS-653:


r1451313


 SharePoint connector should set file name
 -

 Key: CONNECTORS-653
 URL: https://issues.apache.org/jira/browse/CONNECTORS-653
 Project: ManifoldCF
  Issue Type: Improvement
  Components: SharePoint connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 SharePoint connector should set file name



[jira] [Resolved] (CONNECTORS-653) SharePoint connector should set file name

2013-02-28 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-653.


Resolution: Fixed

 SharePoint connector should set file name
 -

 Key: CONNECTORS-653
 URL: https://issues.apache.org/jira/browse/CONNECTORS-653
 Project: ManifoldCF
  Issue Type: Improvement
  Components: SharePoint connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 SharePoint connector should set file name



[jira] [Created] (CONNECTORS-654) A core extension-to-mime-type mapping would be very helpful to have

2013-02-28 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-654:
--

 Summary: A core extension-to-mime-type mapping would be very 
helpful to have
 Key: CONNECTORS-654
 URL: https://issues.apache.org/jira/browse/CONNECTORS-654
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


A number of connectors need to come up with a mime type without having one 
available.  A core mapping of extension to mime type would make this easier.
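
A minimal Java sketch of what such a mapping could look like follows. The class name ExtensionMimeMap, the method mimeTypeFor, and the handful of entries are assumptions for illustration; this is not the class that ManifoldCF ships.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, for illustration only. */
final class ExtensionMimeMap {
  private static final Map<String, String> TYPES = new HashMap<>();

  static {
    // A few sample entries; a real table would be much larger.
    TYPES.put("txt", "text/plain");
    TYPES.put("html", "text/html");
    TYPES.put("pdf", "application/pdf");
    TYPES.put("doc", "application/msword");
  }

  private ExtensionMimeMap() {}

  /** Returns the mime type for a file name, or null when the extension is unknown. */
  static String mimeTypeFor(String fileName) {
    int dot = fileName.lastIndexOf('.');
    if (dot < 0 || dot == fileName.length() - 1) {
      return null; // no extension at all, or a trailing dot
    }
    String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    return TYPES.get(extension);
  }
}
```

Returning null for an unknown extension leaves the fallback decision to each connector, which can skip the mime type or supply its own default.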




[jira] [Resolved] (CONNECTORS-654) A core extension-to-mime-type mapping would be very helpful to have

2013-02-28 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-654.


Resolution: Fixed

 A core extension-to-mime-type mapping would be very helpful to have
 ---

 Key: CONNECTORS-654
 URL: https://issues.apache.org/jira/browse/CONNECTORS-654
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 A number of connectors need to come up with a mime type without having one 
 available.  A core mapping of extension to mime type would make this easier.



[jira] [Commented] (CONNECTORS-654) A core extension-to-mime-type mapping would be very helpful to have

2013-02-28 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13589890#comment-13589890
 ] 

Karl Wright commented on CONNECTORS-654:


r1451325


 A core extension-to-mime-type mapping would be very helpful to have
 ---

 Key: CONNECTORS-654
 URL: https://issues.apache.org/jira/browse/CONNECTORS-654
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core
Affects Versions: ManifoldCF 1.2
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 A number of connectors need to come up with a mime type without having one 
 available.  A core mapping of extension to mime type would make this easier.



[jira] [Created] (CONNECTORS-656) Add mimetype and/or filename support to CMIS connector

2013-03-03 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-656:
--

 Summary: Add mimetype and/or filename support to CMIS connector
 Key: CONNECTORS-656
 URL: https://issues.apache.org/jira/browse/CONNECTORS-656
 Project: ManifoldCF
  Issue Type: Improvement
  Components: CMIS connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Piergiorgio Lucidi
 Fix For: ManifoldCF 1.2


Add mimetype and/or filename support to CMIS connector.  The appropriate 
methods are:

RepositoryDocument.setFileName(...)
and
RepositoryDocument.setMimeType(...)




[jira] [Created] (CONNECTORS-655) Add mimetype and/or filename support to Alfresco connector

2013-03-03 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-655:
--

 Summary: Add mimetype and/or filename support to Alfresco connector
 Key: CONNECTORS-655
 URL: https://issues.apache.org/jira/browse/CONNECTORS-655
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Alfresco connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Piergiorgio Lucidi
 Fix For: ManifoldCF 1.2


Add mimetype and/or filename support to Alfresco connector.  The appropriate 
methods are:

RepositoryDocument.setFileName(...)
and
RepositoryDocument.setMimeType(...)




[jira] [Commented] (CONNECTORS-657) Normalize date/timestamps format across connectors

2013-03-04 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592220#comment-13592220
 ] 

Karl Wright commented on CONNECTORS-657:


Hi Maciej,

There is no common date attribute name either.  What metadata are you 
specifically looking at for Solr in that case?

What I recommend is to create a new pair of methods in RepositoryDocument: 
getDatestamp() and setDatestamp().  The format should not be Solr, I think, but 
rather something generic (such as ms since epoch, which we use everywhere 
else).  The Solr connector can then convert it to whatever form it wants - same 
as ElasticSearch, etc.

I'm confused though as to how you are seeing these metadata attributes at all.  
Last updated date usually just gets folded into the version string and is not 
typically sent to the output connector at all.  Or maybe I am not remembering 
properly.
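
To make the suggestion concrete, here is a minimal Java sketch of the output-side conversion: a generic datestamp in ms since epoch is turned into the UTC form Solr expects, and back. The class name DatestampFormat and its methods are assumptions for illustration, not ManifoldCF code.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper, for illustration only. */
final class DatestampFormat {
  private DatestampFormat() {}

  /** Formats ms since epoch as a UTC timestamp with second precision. */
  static String toSolrDate(long epochMillis) {
    // Dropping the sub-second part makes Instant print no fractional digits.
    return Instant.ofEpochMilli(epochMillis).truncatedTo(ChronoUnit.SECONDS).toString();
  }

  /** Parses a UTC timestamp such as 2010-10-10T12:34:00Z back to ms since epoch. */
  static long fromSolrDate(String solrDate) {
    return Instant.parse(solrDate).toEpochMilli();
  }
}
```

Keeping the stored value generic means each output connector only needs its own small formatter like this one.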



 Normalize date/timestamps format across connectors
 --

 Key: CONNECTORS-657
 URL: https://issues.apache.org/jira/browse/CONNECTORS-657
 Project: ManifoldCF
  Issue Type: Bug
Reporter: Maciej Lizewski

 Several connectors add datetime attributes to RepositoryDocument, but they do 
 not share a common format. Examples:
 WikiConnector adds last-updated: 2010-10-10T12:34:00Z
 SharedDriveConnector adds last-updated: Thu May 28 17:39:46 CEST 2009
 and so on.
 Solr requires all date/datetime fields to be passed as YYYY-MM-DDTHH:II:SSZ
 We need to standardize formats (my recommendation is the Solr format) or allow 
 Date attributes to be added to RepositoryDocument and move formatting to 
 OutputConnector.



[jira] [Assigned] (CONNECTORS-657) Normalize date/timestamps format across connectors

2013-03-04 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-657:
--

Assignee: Karl Wright

 Normalize date/timestamps format across connectors
 --

 Key: CONNECTORS-657
 URL: https://issues.apache.org/jira/browse/CONNECTORS-657
 Project: ManifoldCF
  Issue Type: Bug
Reporter: Maciej Lizewski
Assignee: Karl Wright

 Several connectors add datetime attributes to RepositoryDocument, but they do 
 not share a common format. Examples:
 WikiConnector adds last-updated: 2010-10-10T12:34:00Z
 SharedDriveConnector adds last-updated: Thu May 28 17:39:46 CEST 2009
 and so on.
 Solr requires all date/datetime fields to be passed as YYYY-MM-DDTHH:II:SSZ
 We need to standardize formats (my recommendation is the Solr format) or allow 
 Date attributes to be added to RepositoryDocument and move formatting to 
 OutputConnector.



[jira] [Commented] (CONNECTORS-658) Livelink connector: Set either the file name or mime type or both in Repository Document

2013-03-04 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592309#comment-13592309
 ] 

Karl Wright commented on CONNECTORS-658:


Thanks, this is great - I should be able to code this up and close the ticket 
shortly.


 Livelink connector: Set either the file name or mime type or both in 
 Repository Document
 

 Key: CONNECTORS-658
 URL: https://issues.apache.org/jira/browse/CONNECTORS-658
 Project: ManifoldCF
  Issue Type: Improvement
  Components: LiveLink connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Livelink connector: Set either the file name or mime type or both in 
 Repository Document.



[jira] [Commented] (CONNECTORS-657) Normalize date/timestamps format across connectors

2013-03-04 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592312#comment-13592312
 ] 

Karl Wright commented on CONNECTORS-657:


r1452356 adds the created and modified date fields to the RepositoryDocument 
structure.


 Normalize date/timestamps format across connectors
 --

 Key: CONNECTORS-657
 URL: https://issues.apache.org/jira/browse/CONNECTORS-657
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework crawler agent
Affects Versions: ManifoldCF 1.2
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Several connectors add datetime attributes to RepositoryDocument, but they do 
 not share a common format. Examples:
 WikiConnector adds last-updated: 2010-10-10T12:34:00Z
 SharedDriveConnector adds last-updated: Thu May 28 17:39:46 CEST 2009
 and so on.
 Solr requires all date/datetime fields to be passed as YYYY-MM-DDTHH:II:SSZ
 We need to standardize formats (my recommendation is the Solr format) or allow 
 Date attributes to be added to RepositoryDocument and move formatting to 
 OutputConnector.



[jira] [Commented] (CONNECTORS-658) Livelink connector: Set either the file name or mime type or both in Repository Document

2013-03-04 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592336#comment-13592336
 ] 

Karl Wright commented on CONNECTORS-658:


r1452375


 Livelink connector: Set either the file name or mime type or both in 
 Repository Document
 

 Key: CONNECTORS-658
 URL: https://issues.apache.org/jira/browse/CONNECTORS-658
 Project: ManifoldCF
  Issue Type: Improvement
  Components: LiveLink connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Livelink connector: Set either the file name or mime type or both in 
 Repository Document.



[jira] [Resolved] (CONNECTORS-658) Livelink connector: Set either the file name or mime type or both in Repository Document

2013-03-04 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-658.


Resolution: Fixed

 Livelink connector: Set either the file name or mime type or both in 
 Repository Document
 

 Key: CONNECTORS-658
 URL: https://issues.apache.org/jira/browse/CONNECTORS-658
 Project: ManifoldCF
  Issue Type: Improvement
  Components: LiveLink connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Livelink connector: Set either the file name or mime type or both in 
 Repository Document.



[jira] [Commented] (CONNECTORS-657) Normalize date/timestamps format across connectors

2013-03-04 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592724#comment-13592724
 ] 

Karl Wright commented on CONNECTORS-657:


r1452563 adds standard support to Solr connector.

I think this is now ready to try out, if you want to synch up, build, and see 
how it works for you.


 Normalize date/timestamps format across connectors
 --

 Key: CONNECTORS-657
 URL: https://issues.apache.org/jira/browse/CONNECTORS-657
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework crawler agent
Affects Versions: ManifoldCF 1.2
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Several connectors add datetime attributes to RepositoryDocument, but they do 
 not share a common format. Examples:
 WikiConnector adds last-updated: 2010-10-10T12:34:00Z
 SharedDriveConnector adds last-updated: Thu May 28 17:39:46 CEST 2009
 and so on.
 Solr requires all date/datetime fields to be passed as YYYY-MM-DDTHH:II:SSZ
 We need to standardize formats (my recommendation is the Solr format) or allow 
 Date attributes to be added to RepositoryDocument and move formatting to 
 OutputConnector.



[jira] [Commented] (CONNECTORS-657) Normalize date/timestamps format across connectors

2013-03-05 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13593244#comment-13593244
 ] 

Karl Wright commented on CONNECTORS-657:


r1452696 adds support to the SharePoint connector.


 Normalize date/timestamps format across connectors
 --

 Key: CONNECTORS-657
 URL: https://issues.apache.org/jira/browse/CONNECTORS-657
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework crawler agent
Affects Versions: ManifoldCF 1.2
Reporter: Maciej Lizewski
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 Several connectors add datetime attributes to RepositoryDocument, but they do 
 not share a common format. Examples:
 WikiConnector adds last-updated: 2010-10-10T12:34:00Z
 SharedDriveConnector adds last-updated: Thu May 28 17:39:46 CEST 2009
 and so on.
 Solr requires all date/datetime fields to be passed as YYYY-MM-DDTHH:II:SSZ
 We need to standardize formats (my recommendation is the Solr format) or allow 
 Date attributes to be added to RepositoryDocument and move formatting to 
 OutputConnector.



[jira] [Commented] (CONNECTORS-642) Need an ElasticSearch plugin for enforcing ManifoldCF security

2013-03-06 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595275#comment-13595275
 ] 

Karl Wright commented on CONNECTORS-642:


Got some education from a colleague as to how to extend Elastic Search.  The 
key and important facts are these:

- Built solely with Maven
- Uses annotation to inject plugins
- Extended by annotations that describe REST functionality
- No way to insert a module or class into the ElasticSearch execution chain

Therefore, the likely design for a component will include the following.

(1) A java-level API method which accepts a Lucene Query and produces a 
properly qualified/wrapped Query as output
(2) Possibly an example REST method that can be used as a replacement 
for the standard ElasticSearch query parsing REST method entry point, except at 
a different URL, which calls the java-level API method mentioned in (1) above
(3) A Maven build which produces the java-level API jar
(4) A Maven build which produces the REST plugin example
(5) Tests which exercise the whole thing, under maven

More details when I have them...
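
The idea behind item (1) can be sketched in plain Java without any Lucene or ElasticSearch dependency. In the sketch below a java.util.function.Predicate stands in for a Lucene Query, and the Doc class, its allowTokens field, and the wrap method are all hypothetical names chosen for the example; an actual plugin would build a Lucene query or filter instead.

```java
import java.util.Collection;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch; Predicate<Doc> stands in for a Lucene Query. */
final class SecurityWrapSketch {
  private SecurityWrapSketch() {}

  /** Stand-in for an indexed document: its text plus the tokens allowed to read it. */
  static final class Doc {
    final String text;
    final Set<String> allowTokens;

    Doc(String text, Set<String> allowTokens) {
      this.text = text;
      this.allowTokens = allowTokens;
    }
  }

  /** Wraps a caller-supplied query so it matches only documents the user may see. */
  static Predicate<Doc> wrap(Predicate<Doc> userQuery, Collection<String> userTokens) {
    // A document is visible when it carries at least one of the user's tokens.
    Predicate<Doc> authorized =
        doc -> userTokens.stream().anyMatch(doc.allowTokens::contains);
    // The wrapped query is the original query AND the authorization check.
    return userQuery.and(authorized);
  }
}
```

The point of the shape is that the caller's query is never modified; it is only combined with an authorization clause, so the REST example in (2) could call the same method.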
 

 Need an ElasticSearch plugin for enforcing ManifoldCF security
 --

 Key: CONNECTORS-642
 URL: https://issues.apache.org/jira/browse/CONNECTORS-642
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 ElasticSearch is becoming popular and we need to support it fully.  In order 
 for that to happen, we really need an ElasticSearch ManifoldCF plugin.



[jira] [Commented] (CONNECTORS-642) Need an ElasticSearch plugin for enforcing ManifoldCF security

2013-03-07 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13595924#comment-13595924
 ] 

Karl Wright commented on CONNECTORS-642:


Further inspection indicates that the following class is where all the action 
is:
https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/rest/action/search/RestSearchAction.java
Unfortunately this does not give a good idea of 
how to override query generation, since the generation goes via a class called 
QueryStringQueryBuilder, which takes many parameters and presumably constructs 
a query string in the end.  The builder is passed to a SearchSourceBuilder 
instance, which is added into a SearchRequest object.  The SearchRequest object 
is apparently what is used to execute the query.

It is not obvious at all how to hook into this chain without replacing the 
whole thing.  Looking for suggestions on this.


 Need an ElasticSearch plugin for enforcing ManifoldCF security
 --

 Key: CONNECTORS-642
 URL: https://issues.apache.org/jira/browse/CONNECTORS-642
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 ElasticSearch is becoming popular and we need to support it fully.  In order 
 for that to happen, we really need an ElasticSearch ManifoldCF plugin.



[jira] [Commented] (CONNECTORS-642) Need an ElasticSearch plugin for enforcing ManifoldCF security

2013-03-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597060#comment-13597060
 ] 

Karl Wright commented on CONNECTORS-642:


It looks like the SearchSourceBuilder class has a number of different (String) 
fields - a query, a filter, etc.  This is promising, but since they are strings I 
bet this is before query parsing.  Elsewhere in the code, it looks like 
ElasticSearch has a parallel query builder arrangement we might be able to use, 
which has no Lucene dependencies, e.g. 
org.elasticsearch.index.query.BoolQueryBuilder, which basically would be a query 
structure parallel to the Lucene equivalents.  But it is not clear how to use 
such a query builder productively.  Still looking for answers to that one.


 Need an ElasticSearch plugin for enforcing ManifoldCF security
 --

 Key: CONNECTORS-642
 URL: https://issues.apache.org/jira/browse/CONNECTORS-642
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 ElasticSearch is becoming popular and we need to support it fully.  In order 
 for that to happen, we really need an ElasticSearch ManifoldCF plugin.



[jira] [Commented] (CONNECTORS-642) Need an ElasticSearch plugin for enforcing ManifoldCF security

2013-03-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597166#comment-13597166
 ] 

Karl Wright commented on CONNECTORS-642:


Further discussion yielded the following general plan:

- What we want at a basic level is to provide a way of modifying the filter 
in a SearchSourceBuilder object.  There are FilterBuilder classes under 
org.elasticsearch.index.query.FilterBuilder that correspond to the Lucene 
entities I'm familiar with.
- The basic way people would be expected to use this would be to enhance 
client.prepareSearch(myindex), which returns a SearchRequestBuilder object, 
which in turn wraps a SearchSourceBuilder.
- The example REST action we'd provide would provide the equivalent of 
org.elasticsearch.rest.action.search.RestSearchAction, but with the enhanced 
result from client.prepareSearch().  I'm still a bit fuzzy on whether I should 
do this by creating a wrapped client and calling the original method, or some 
other way.  Extension of the client does not seem reasonable since what is 
passed into REST methods is an interface, not a class.  A wrapped client would 
be very easy to use, but would be sensitive to changes in the client interface, 
obviously.
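
The wrapped-client option in the last bullet can be sketched without any 
ElasticSearch dependency.  Everything named below (SearchClient, SearchBuilder, 
RecordingSearchBuilder, SecuredSearchClient, the filter string) is a 
hypothetical stand-in and not the ElasticSearch API; the sketch only shows that 
a wrapper implements the same interface, delegates to the original, and 
enhances what prepareSearch returns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the client interface and the request builder.
interface SearchBuilder {
  SearchBuilder addFilter(String filter);
  List<String> filters();
}

interface SearchClient {
  SearchBuilder prepareSearch(String index);
}

// Trivial builder that just records the filters applied to it.
final class RecordingSearchBuilder implements SearchBuilder {
  private final List<String> filters = new ArrayList<String>();

  public SearchBuilder addFilter(String filter) {
    filters.add(filter);
    return this;
  }

  public List<String> filters() {
    return filters;
  }
}

// Wrapper: same interface, delegates to the wrapped client, then adds the security filter.
final class SecuredSearchClient implements SearchClient {
  private final SearchClient delegate;
  private final String securityFilter;

  SecuredSearchClient(SearchClient delegate, String securityFilter) {
    this.delegate = delegate;
    this.securityFilter = securityFilter;
  }

  public SearchBuilder prepareSearch(String index) {
    // Every method added to SearchClient would need a matching delegating
    // method here, which is the sensitivity to interface changes noted above.
    return delegate.prepareSearch(index).addFilter(securityFilter);
  }
}
```

A caller would construct SecuredSearchClient around whatever client it was 
handed and use it exactly as before; the cost is one delegating method per 
method of the wrapped interface.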


 Need an ElasticSearch plugin for enforcing ManifoldCF security
 --

 Key: CONNECTORS-642
 URL: https://issues.apache.org/jira/browse/CONNECTORS-642
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 ElasticSearch is becoming popular and we need to support it fully.  In order 
 for that to happen, we really need an ElasticSearch ManifoldCF plugin.



[jira] [Commented] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597221#comment-13597221
 ] 

Karl Wright commented on CONNECTORS-661:


Turns out there's a way to do it, which requires modification of the 
ModifiedSolrHttpServer class to pull off.  I'll be experimenting with that 
shortly.

 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.
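
 The retry fails because a one-shot stream cannot be read a second time.  One 
 generic way to make a request body replayable is to buffer it in memory first. 
 The sketch below is plain Java with a hypothetical class name (ReplayableBody); 
 it is not the change made in ManifoldCF or SolrJ, and it trades memory for 
 repeatability, which may not be acceptable for large documents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: holds a fully buffered copy of a request body.
final class ReplayableBody {
  private final byte[] bytes;

  private ReplayableBody(byte[] bytes) {
    this.bytes = bytes;
  }

  // Drains the one-shot stream into memory so the body can be sent more than once.
  static ReplayableBody buffer(InputStream source) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      byte[] chunk = new byte[8192];
      int read;
      while ((read = source.read(chunk)) != -1) {
        buffer.write(chunk, 0, read);
      }
      return new ReplayableBody(buffer.toByteArray());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Each call returns a fresh stream positioned at the start of the body.
  ByteArrayInputStream open() {
    return new ByteArrayInputStream(bytes);
  }

  int length() {
    return bytes.length;
  }
}
```

 With a body like this, the first attempt and the retry after the 
 WWW-Authenticate challenge each call open() and read from the beginning.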



[jira] [Commented] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597491#comment-13597491
 ] 

Karl Wright commented on CONNECTORS-661:


r1454518 enables expect-continue, which according to Oleg might well fix the 
problem.


 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.



[jira] [Resolved] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-661.


Resolution: Fixed

 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.



[jira] [Commented] (CONNECTORS-662) Support ISO8601 dates in RSS connector

2013-03-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597596#comment-13597596
 ] 

Karl Wright commented on CONNECTORS-662:


r1454590


 Support ISO8601 dates in RSS connector
 --

 Key: CONNECTORS-662
 URL: https://issues.apache.org/jira/browse/CONNECTORS-662
 Project: ManifoldCF
  Issue Type: Improvement
  Components: RSS connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 While RSS fields explicitly use RFC822 dates, I've been seeing feeds with 
 ISO8601 dates instead.  So the RSS connector might as well support that 
 format too.
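
 Accepting both formats amounts to trying one parser and falling back to the 
 other.  The following is a minimal sketch using java.time (Java 8 or later), 
 not the code committed in r1454590; the class name FeedDates is hypothetical. 
 It uses the RFC 1123 profile of RFC 822, so two-digit years are not handled, 
 and only ISO 8601 values that carry an offset or "Z" are accepted.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class FeedDates {
  // Tries the RFC 822 style first (via the RFC 1123 profile), then ISO 8601
  // with an offset.  Returns null when neither format matches.
  static Instant parse(String text) {
    DateTimeFormatter[] formats = {
      DateTimeFormatter.RFC_1123_DATE_TIME,
      DateTimeFormatter.ISO_OFFSET_DATE_TIME
    };
    for (DateTimeFormatter format : formats) {
      try {
        return OffsetDateTime.parse(text.trim(), format).toInstant();
      } catch (DateTimeParseException e) {
        // Not this format; fall through to the next one.
      }
    }
    return null;
  }
}
```

 Both "Fri, 8 Mar 2013 12:00:00 GMT" and "2013-03-08T12:00:00Z" parse to the 
 same instant with this sketch.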



[jira] [Resolved] (CONNECTORS-662) Support ISO8601 dates in RSS connector

2013-03-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-662.


Resolution: Fixed

 Support ISO8601 dates in RSS connector
 --

 Key: CONNECTORS-662
 URL: https://issues.apache.org/jira/browse/CONNECTORS-662
 Project: ManifoldCF
  Issue Type: Improvement
  Components: RSS connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 While RSS fields explicitly use RFC822 dates, I've been seeing feeds with 
 ISO8601 dates instead.  So the RSS connector might as well support that 
 format too.



[jira] [Commented] (CONNECTORS-642) Need an ElasticSearch plugin for enforcing ManifoldCF security

2013-03-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597903#comment-13597903
 ] 

Karl Wright commented on CONNECTORS-642:


I've converted the core class to just build an ES FilterBuilder that would 
enforce authorization if applied.  It turns out to be impossible to modify a 
filter within a SearchRequestBuilder object, so that won't work.  We'll have to 
leave it with instructing people to use our FilterBuilder as a component of 
their own filters.
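
Using the authorization filter "as a component" of a caller's own filter is a 
plain AND of the two.  The sketch below models that with java.util.function 
predicates instead of ElasticSearch FilterBuilder objects.  The Doc class, the 
token names, and the allow/deny semantics (a document passes when the user 
holds at least one of its allow tokens and none of its deny tokens) are 
assumptions made for illustration, not the plugin's actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical indexed document: a category plus allow/deny access tokens.
final class Doc {
  final String category;
  final Set<String> allow;
  final Set<String> deny;

  Doc(String category, Set<String> allow, Set<String> deny) {
    this.category = category;
    this.allow = allow;
    this.deny = deny;
  }
}

final class SecurityFilterSketch {
  // Plays the role of "our" filter: passes a document when at least one of the
  // user's tokens is in its allow set and none is in its deny set (assumed semantics).
  static Predicate<Doc> authorizationFilter(final Set<String> userTokens) {
    return doc -> {
      boolean allowed = false;
      for (String token : userTokens) {
        if (doc.deny.contains(token)) {
          return false;
        }
        if (doc.allow.contains(token)) {
          allowed = true;
        }
      }
      return allowed;
    };
  }

  // Convenience for building token sets in examples.
  static Set<String> tokens(String... values) {
    return new HashSet<String>(Arrays.asList(values));
  }
}
```

A caller would then write something like 
ownFilter.and(SecurityFilterSketch.authorizationFilter(userTokens)), keeping 
its own criteria and adding the security check on top.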


 Need an ElasticSearch plugin for enforcing ManifoldCF security
 --

 Key: CONNECTORS-642
 URL: https://issues.apache.org/jira/browse/CONNECTORS-642
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Elastic Search connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 1.2


 ElasticSearch is becoming popular and we need to support it fully.  In order 
 for that to happen, we really need an ElasticSearch ManifoldCF plugin.



[jira] [Created] (CONNECTORS-663) ManifoldCF needs the ability to not always check for deletion on a crawl

2013-03-11 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-663:
--

 Summary: ManifoldCF needs the ability to not always check for 
deletion on a crawl
 Key: CONNECTORS-663
 URL: https://issues.apache.org/jira/browse/CONNECTORS-663
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework agents process
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF next


The ManifoldCF framework's crawling model always brings the index in synch with 
the repository by the end of the job.  Unfortunately, for many repositories, 
the incremental nature of ManifoldCF is lost in part because deletion tracking 
is not done by the repository.  ManifoldCF could therefore benefit by the 
ability to have two different job run cycles: (1) A full run, as is done now, 
and (2) a partial run, which does not necessarily attempt to clean up 
deletions.  This of course only makes sense if subsequent job runs have the 
ability to do the deletion cleanup.

In principle, I believe this can work but has significant implications in 
the following areas:

- Job states - there needs to be a new set of job states corresponding to which 
type of job run is selected;
- UI - there needs to be a way of telling ManifoldCF what kind of job run is 
desired;
- API - same problem as UI;
- Job scheduling; we need the ability to determine what kind of job run is done 
when, which also has schema implications





[jira] [Reopened] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-11 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reopened CONNECTORS-661:



The fix the HttpClient team recommended didn't work.  Have to try another way.

 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.



[jira] [Commented] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-11 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598998#comment-13598998
 ] 

Karl Wright commented on CONNECTORS-661:


HTTP only allows one response line per response, that's why.  Sounds like a bug 
in Resin, doesn't it?


 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.



[jira] [Comment Edited] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-11 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598998#comment-13598998
 ] 

Karl Wright edited comment on CONNECTORS-661 at 3/11/13 5:10 PM:
-

HTTP only allows one response line per response, that's why.  If more than one 
response line is coming back per response, then it sounds like a bug in Resin, 
doesn't it?

If multiple responses are taking place, then the 401 Unauthorized must come 
INSTEAD of the 100 Continue, not after it.

You can try -vvv to confirm if you like.





  was (Author: kwri...@metacarta.com):
HTTP only allows one response line per response, that's why.  Sounds like a 
bug in Resin, doesn't it?

  
 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.



[jira] [Commented] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-11 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599035#comment-13599035
 ] 

Karl Wright commented on CONNECTORS-661:


Yeah, having TWO status lines is a definite no-no as far as HTTP is concerned.  
What version of Resin is this, and is there a Resin ticket open for the 
problem?  If not, maybe you should consider opening one.

FWIW, it looks like Apache web server had problems with this functionality 
also, back in 2007.

http://www.gossamer-threads.com/lists/apache/users/340406

As far as MCF is concerned - I'm leery of fixing problems in the wrong place in 
order to work around somebody else's bugs.  I think the proper course of action 
is to see if you can find a solution on your end - perhaps with a patched 
version of Resin.  But absolutely a ticket should be created against Resin, or 
this will not get addressed.

If a Resin fix proves impossible, then I'd consider creating a branch which 
includes Maciej's MCF fix to pre-emptively send basic auth credentials, and we 
can then generate a patch and attach it to this ticket for other resin users to 
apply.  I can certainly help you code this up if you need it.
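
Sending credentials pre-emptively means attaching the Authorization header to 
the first request instead of waiting for a 401 challenge, so the body never 
has to be sent twice.  The sketch below only shows how the Basic header value 
is formed; the class name PreemptiveBasicAuth is hypothetical, java.util.Base64 
assumes Java 8 or later, and this is not Maciej's patch.  How the header is 
attached to HttpClient or SolrJ is deliberately left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class PreemptiveBasicAuth {
  // Builds the value of the Authorization header for HTTP Basic auth:
  // "Basic " followed by base64("user:password").
  static String headerValue(String user, String password) {
    String credentials = user + ":" + password;
    byte[] raw = credentials.getBytes(StandardCharsets.UTF_8);
    return "Basic " + Base64.getEncoder().encodeToString(raw);
  }
}
```

The resulting string goes into an "Authorization" request header on the very 
first POST.  Note that this sends the password to the server unasked, so it is 
only reasonable over a connection the operator already trusts.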



 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.



[jira] [Commented] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-11 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599078#comment-13599078
 ] 

Karl Wright commented on CONNECTORS-661:


Hi Erlend,

To me it appears that curl logging does something screwy in transmitting the 
data in this transaction.  In theory it should wait for the 100 Continue before 
it continues, but it starts to transmit the body anyway.  So this line is 
confused:

 xml/HTTP/1.1 100 Continue

The HTTP/1.1 100 Continue is clearly coming from the server but is not 
properly separated and does not have the proper line break (CRLF) preceding it.

The wire logging you initially sent me is more helpful in this regard:

{code}
DEBUG 2013-03-11 14:54:46,936 (Thread-812) -  POST /solr/uio/update/extract HTTP/1.1
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  Content-Charset: UTF-8
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  Transfer-Encoding: chunked
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  Content-Type: multipart/form-data; boundary=HKJXjK-T02nPdAGqPhEIPWFLYuqM7495HYpg6
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  Host: solr-prod01.uio.no:443
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  Connection: Keep-Alive
DEBUG 2013-03-11 14:54:46,937 (Thread-812) -  Expect: 100-continue
DEBUG 2013-03-11 14:54:46,940 (Thread-812) -  HTTP/1.1 100 Continue[\r][\n]
DEBUG 2013-03-11 14:54:46,940 (Thread-812) -  [\r][\n]
DEBUG 2013-03-11 14:54:46,940 (Thread-812) - Receiving response: HTTP/1.1 100 Continue
DEBUG 2013-03-11 14:54:46,940 (Thread-812) -  HTTP/1.1 100 Continue
DEBUG 2013-03-11 14:54:46,942 (Thread-812) -  c1[\r][\n]
DEBUG 2013-03-11 14:54:46,942 (Thread-812) -  --HKJXjK-T02nPdAGqPhEIPWFLYuqM7495HYpg6[\r][\n]
DEBUG 2013-03-11 14:54:46,942 (Thread-812) -  Content-Disposition: form-data; name=literal.id[\r][\n]
DEBUG 2013-03-11 14:54:46,942 (Thread-812) -  Content-Type: text/plain; charset=UTF-8[\r][\n]
DEBUG 2013-03-11 14:54:46,942 (Thread-812) -  Content-Transfer-Encoding: 8bit[\r][\n]
DEBUG 2013-03-11 14:54:46,942 (Thread-812) -  [\r][\n]
...
DEBUG 2013-03-11 14:54:46,962 (Thread-812) -  [\r][\n]
DEBUG 2013-03-11 14:54:46,962 (Thread-812) -  0[\r][\n]
DEBUG 2013-03-11 14:54:46,962 (Thread-812) -  [\r][\n]
DEBUG 2013-03-11 14:54:46,963 (Thread-812) -  HTTP/1.1 401 Unauthorized[\r][\n]
DEBUG 2013-03-11 14:54:46,963 (Thread-812) -  Date: Mon, 11 Mar 2013 13:54:46 GMT[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Server: Apache/2.2.22 (Unix)[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  WWW-Authenticate: Basic realm=resin[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Content-Length: 159[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Content-Type: text/html; charset=utf-8[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  X-UA-Compatible: IE=EmulateIE7[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Connection: close[\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  [\r][\n]
DEBUG 2013-03-11 14:54:46,964 (Thread-812) - Receiving response: HTTP/1.1 401 Unauthorized
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  HTTP/1.1 401 Unauthorized
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Date: Mon, 11 Mar 2013 13:54:46 GMT
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Server: Apache/2.2.22 (Unix)
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  WWW-Authenticate: Basic realm=resin
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Content-Length: 159
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Content-Type: text/html; charset=utf-8
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  X-UA-Compatible: IE=EmulateIE7
DEBUG 2013-03-11 14:54:46,964 (Thread-812) -  Connection: close
DEBUG 2013-03-11 14:54:46,964 (Thread-812) - Authentication required
{code}

Here you can clearly see that the first request proceeds up to the headers, 
then there's a newline, then a 100 Continue comes back from the server, then 
the rest of the data goes over, then finally the 401 comes back.
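
The order of status lines in that log can be read mechanically: the 100 
Continue is an interim response, and the 401 is the final one that only 
arrives after the body has gone over.  A small sketch of that reading follows; 
the class name StatusLines is hypothetical, and the input is just the two 
status lines copied from the log.

```java
import java.util.List;

final class StatusLines {
  // Extracts the numeric status code from a status line such as
  // "HTTP/1.1 401 Unauthorized".
  static int statusCode(String statusLine) {
    String[] parts = statusLine.trim().split(" ", 3);
    return Integer.parseInt(parts[1]);
  }

  // 1xx responses are interim; the client keeps reading until a final
  // (non-1xx) response arrives.
  static boolean isInterim(int code) {
    return code >= 100 && code < 200;
  }

  // Returns the first final status code in the sequence, or -1 if only
  // interim responses were seen.
  static int finalStatus(List<String> statusLines) {
    for (String line : statusLines) {
      int code = statusCode(line);
      if (!isInterim(code)) {
        return code;
      }
    }
    return -1;
  }
}
```

Fed "HTTP/1.1 100 Continue" followed by "HTTP/1.1 401 Unauthorized", 
finalStatus reports 401, which is the response HttpClient then tries to answer 
with a retry.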

Furthermore, note the Server header: this is Apache Web Server, not Resin, that 
is doing this, esp. if the basic auth is configured in Apache.  If so, it sounds 
like exactly the problem that was described in 2007.


 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  

[jira] [Commented] (CONNECTORS-663) ManifoldCF needs the ability to not always check for deletion on a crawl

2013-03-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599873#comment-13599873
 ] 

Karl Wright commented on CONNECTORS-663:


I'm thinking that the basic new job cycle will be called a "minimal" cycle, and 
will consist of:

- Seeding
- Processing the seeded documents, and all discovered documents

So, for a MODEL_ADD connector, only additions will be crawled.  For a 
MODEL_ADD_CHANGE connector, additions and modifications will be crawled, etc.  
The UI will have a "Start minimal" clickable link in addition to a "Start" link.
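
What each kind of run would pick up can be written down as a small table in 
code.  This is a sketch under stated assumptions, not ManifoldCF's job-state 
implementation: the enums below are hypothetical mirrors of the connector 
models named above, and the ADD_CHANGE_DELETE row is my reading of the "etc." 
in the comment.

```java
import java.util.EnumSet;
import java.util.Set;

final class MinimalCycleSketch {
  enum Change { ADDITION, MODIFICATION, DELETION }

  enum RunType { FULL, MINIMAL }

  // Hypothetical mirror of the connector models named above.
  enum Model { ADD, ADD_CHANGE, ADD_CHANGE_DELETE }

  // Changes a run is expected to pick up (assumed reading of the comment above).
  static Set<Change> detected(Model model, RunType run) {
    if (run == RunType.FULL) {
      // A full run brings the index in sync, so every kind of change is handled.
      return EnumSet.allOf(Change.class);
    }
    switch (model) {
      case ADD:
        return EnumSet.of(Change.ADDITION);
      case ADD_CHANGE:
        return EnumSet.of(Change.ADDITION, Change.MODIFICATION);
      default:
        // Assumption: a connector whose seeding reports deletions lets a
        // minimal run see them too.
        return EnumSet.allOf(Change.class);
    }
  }
}
```

Anything a minimal run leaves out, deletions in particular for the first two 
models, has to wait for a later full run.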



 ManifoldCF needs the ability to not always check for deletion on a crawl
 

 Key: CONNECTORS-663
 URL: https://issues.apache.org/jira/browse/CONNECTORS-663
 Project: ManifoldCF
  Issue Type: New Feature
  Components: Framework agents process
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF next


 The ManifoldCF framework's crawling model always brings the index in synch 
 with the repository by the end of the job.  Unfortunately, for many 
 repositories, the incremental nature of ManifoldCF is lost in part because 
 deletion tracking is not done by the repository.  ManifoldCF could therefore 
 benefit by the ability to have two different job run cycles: (1) A full run, 
 as is done now, and (2) a partial run, which does not necessarily attempt to 
 clean up deletions.  This of course only makes sense if subsequent job runs 
 have the ability to do the deletion cleanup.
 In principle, I believe this can work but has significant implications in 
 the following areas:
 - Job states - there needs to be a new set of job states corresponding to 
 which type of job run is selected;
 - UI - there needs to be a way of telling ManifoldCF what kind of job run is 
 desired;
 - API - same problem as UI;
 - Job scheduling; we need the ability to determine what kind of job run is 
 done when, which also has schema implications



[jira] [Commented] (CONNECTORS-661) Solr Connector cannot index documents when Solr is protected by authentication

2013-03-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600892#comment-13600892
 ] 

Karl Wright commented on CONNECTORS-661:


So is it true that you were able to show that the problem is not in Resin, but 
rather in either the version of Apache webserver being used or in mod_caucho?

If it winds up being an interaction between Apache and mod_caucho, I think this 
ticket can be closed as far as MCF is concerned.  If it winds up being a bug in 
Apache that is not resolved, we'll have to think long and hard about a hack fix, 
unfortunately, because of the prevalence of Apache out there.  Please let us 
know.





 Solr Connector cannot index documents when Solr is protected by authentication
 --

 Key: CONNECTORS-661
 URL: https://issues.apache.org/jira/browse/CONNECTORS-661
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 1.1.1
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Critical
 Fix For: ManifoldCF 1.2


 The Solr connector cannot deal with Solr servers that are protected by 
 authentication.  The reason is that HttpClient forces a retry based on the 
 WWW-authenticate header that is returned, but the HttpEntity is not 
 resettable and thus the retry fails for that reason.
 This is a regression, because the Solr connector used to properly support 
 basic auth, and now no longer does.


