[jira] Commented: (CONNECTORS-100) DB lock timeout

2010-09-07 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906793#action_12906793
 ] 

Karl Wright commented on CONNECTORS-100:


Just in case, I've run a spinner test pounding on the API at the same time the 
UI is being pounded on.  This is a Windows laptop with the current trunk  
version.  No such errors appear for me.


 DB lock timeout
 ---

 Key: CONNECTORS-100
 URL: https://issues.apache.org/jira/browse/CONNECTORS-100
 Project: Apache Connectors Framework
  Issue Type: Bug
  Components: Framework core
 Environment: Running unmodified dist/example from trunk/ using the 
 default configuration.
Reporter: Andrzej Bialecki 

 When a job is started and running (via crawler-ui) occasionally it's not 
 possible to display a list of running jobs. The problem persists even after 
 restarting ACF. The following exception is thrown in the console:
 {code}
 org.apache.acf.core.interfaces.ACFException: Database exception: Exception 
 doing query: A lock could not be obtained within the time requested
 at 
 org.apache.acf.core.database.Database.executeViaThread(Database.java:421)
 at 
 org.apache.acf.core.database.Database.executeUncachedQuery(Database.java:465)
 at 
 org.apache.acf.core.database.Database$QueryCacheExecutor.create(Database.java:1072)
 at 
 org.apache.acf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
 at 
 org.apache.acf.core.database.Database.executeQuery(Database.java:167)
 at 
 org.apache.acf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:727)
 at 
 org.apache.acf.crawler.jobs.JobManager.makeJobStatus(JobManager.java:5611)
 at 
 org.apache.acf.crawler.jobs.JobManager.getAllStatus(JobManager.java:5549)
 at 
 org.apache.jsp.showjobstatus_jsp._jspService(showjobstatus_jsp.java:316)
 at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at 
 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:377)
 at 
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)
 at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
 at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
 at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
 at 
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
 at 
 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
 at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 at org.mortbay.jetty.Server.handle(Server.java:326)
 at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
 at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
 at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
 at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
 Caused by: java.sql.SQLTransactionRollbackException: A lock could not be 
 obtained within the time requested
 at 
 org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown 
 Source)
 at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
 Source)
 at 
 org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
 Source)
 at 
 org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
 Source)
 at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
 Source)
 at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
 Source)
 at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown 
 Source)
 at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
 at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
 at org.apache.acf.core.database.Database.execute(Database.java:526)
 at 
 org.apache.acf.core.database.Database$ExecuteQueryThread.run(Database.java:381)
 

[jira] Commented: (CONNECTORS-99) REST API serialization inconsistency

2010-09-07 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906799#action_12906799
 ] 

Andrzej Bialecki  commented on CONNECTORS-99:
-

Yes, it's a wish :) I can live with the way things are, and I can always check 
whether it's a naked object or an array... it's just not too friendly for the 
client. At least it would be good to document this behavior.

 REST API serialization inconsistency
 

 Key: CONNECTORS-99
 URL: https://issues.apache.org/jira/browse/CONNECTORS-99
 Project: Apache Connectors Framework
  Issue Type: Wish
  Components: API
 Environment: ACF trunk.
Reporter: Andrzej Bialecki 
Priority: Minor

 There is some inconsistency in REST APIs that makes the returned values more 
 difficult to process than necessary. It boils down to the fact that lists of 
 values are serialized into JSON arrays only when there is more than 1 element 
 on the list, but they are serialized into plain JSON objects when there is 0 
 or 1 element on the list. Example:
 * listings of jobs, connectors, connections, repositories etc. all suffer 
 from this symptom:
 {code}
 * 1 element:
   {"job":{"id":"1283811504796","description":"job 1 ...
 * 2 elements:
   {"job":[{"id":"1283811504796","description":"job 1 ...
 {code}
 * nested elements, such as e.g. job metadata:
 {code}
 1 element:
   "metadata":{"_value_":"","_attribute_name":"jobKey1","_attribute_value":"jobVal1"}
 2 elements:
   "metadata":[{"_value_":"","_attribute_name":"jobKey1","_attribute_value":"jobVal1"},{"_value_":"","_attribute_name":"jobKey2","_attribute_value":"jobVal2"}]
 {code}
 In my opinion, in all the above cases the API should always return a JSON 
 array for those elements that can occur with cardinality > 1.
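
Until the API changes, a client can hide the difference itself. Below is a minimal Java sketch of that check, assuming the response has already been parsed by some JSON library into plain java.util.Map / java.util.List values. The class name JsonCardinality and the method asList are hypothetical and not part of ACF, and treating a missing value as an empty list is an assumption about the 0-element case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side helper; not part of ACF. */
final class JsonCardinality {
    private JsonCardinality() {}

    /**
     * Normalizes an already-parsed JSON value to a list:
     * null (assumed to mean "no elements") becomes an empty list,
     * an array becomes a copy of its elements,
     * anything else (a "naked" object or scalar) becomes a one-element list.
     */
    static List<Object> asList(Object value) {
        if (value == null) {
            return Collections.emptyList();
        }
        if (value instanceof List<?>) {
            return new ArrayList<>((List<?>) value);
        }
        List<Object> single = new ArrayList<>();
        single.add(value);
        return single;
    }

    public static void main(String[] args) {
        // Illustrative data only, shaped like the "job" examples above.
        Map<String, Object> job = Map.of("id", "1283811504796", "description", "job 1");
        System.out.println(asList(job).size());               // 1
        System.out.println(asList(List.of(job, job)).size()); // 2
        System.out.println(asList(null).size());              // 0
    }
}
```

Calling asList on the parsed "job" or "metadata" value gives the client one code path for both serializations.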

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CONNECTORS-100) DB lock timeout

2010-09-07 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906805#action_12906805
 ] 

Karl Wright commented on CONNECTORS-100:


OK, that's a rather different scenario from the one you first described.  The 
database is under high load because you are in fact crawling, and possibly 
crawling flat-out without any significant throttling.  It's entirely possible 
that Derby's default lock timeout is simply not long enough to support those 
conditions.

If you want to continue to use the quick-start for your crawl task, then you 
will probably want to research how to increase this timeout using the Derby 
configuration file.  My suggestion, though, would be to try PostgreSQL 
instead, since its behavior characteristics are much better known.  You can 
use PostgreSQL with the quick-start by changing the line in properties.xml from:

<property name="org.apache.acf.databaseimplementationclass" 
value="org.apache.acf.core.database.DBInterfaceDerby"/>

to

<property name="org.apache.acf.databaseimplementationclass" 
value="org.apache.acf.core.database.DBInterfacePostgreSQL"/>

You will, of course, also need to install PostgreSQL.
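
For anyone staying on Derby, here is a rough sketch of one way to raise the timeout. Derby's lock wait timeout is controlled by the derby.locks.waitTimeout property (in seconds), which can go in derby.properties or be set as a JVM system property before the Derby engine boots. Whether the quick-start picks it up this way, and whether it cures this particular error, is an assumption that has not been tested here. The class name DerbyLockTimeout and the value 300 are purely illustrative.

```java
/** Hypothetical launcher-side helper; not part of ACF. */
final class DerbyLockTimeout {
    /** Derby's documented property for the lock wait timeout, in seconds. */
    static final String WAIT_TIMEOUT_PROPERTY = "derby.locks.waitTimeout";

    private DerbyLockTimeout() {}

    /**
     * Sets the lock wait timeout as a JVM system property.
     * This only has an effect if it runs before Derby is booted.
     * This sketch accepts positive values only.
     */
    static void configure(int seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive: " + seconds);
        }
        System.setProperty(WAIT_TIMEOUT_PROPERTY, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        configure(300); // illustrative value, not a recommendation
        System.out.println(WAIT_TIMEOUT_PROPERTY + "=" + System.getProperty(WAIT_TIMEOUT_PROPERTY));
    }
}
```

The same setting can be expressed as a single line, derby.locks.waitTimeout=300, in a derby.properties file in the Derby system directory.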






[jira] Created: (CONNECTORS-101) File system connector would benefit by default crawling rules

2010-09-07 Thread Karl Wright (JIRA)
File system connector would benefit by default crawling rules
-

 Key: CONNECTORS-101
 URL: https://issues.apache.org/jira/browse/CONNECTORS-101
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Priority: Minor


When you add a path to a file system connector job, it should automatically put 
in rules that cause it to include all files and directories under that path.  
This makes it easier to use, and more easily demonstrable too.





[jira] Created: (CONNECTORS-102) Web Connector should have a prepopulated bandwidth throttle

2010-09-07 Thread Karl Wright (JIRA)
Web Connector should have a prepopulated bandwidth throttle
---

 Key: CONNECTORS-102
 URL: https://issues.apache.org/jira/browse/CONNECTORS-102
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Web connector
Reporter: Karl Wright
Priority: Minor


When you first create a web connector connection, the bandwidth tab should come 
prepopulated with a bandwidth throttle that has the following data:

Description: All domains
Bin regular expression: blank
Max connections: 2
Max KB per second: 64
Max fetches per minute: 12

Too many casual users of ACF have been crawling without any throttling, and 
that's going to give ACF a bad name in the long run.
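
To give a feel for what the proposed defaults mean in practice, here is a small Java sketch that turns them into per-fetch and per-minute figures. The class ThrottleDefaults and its methods are hypothetical and not ACF code; how ACF actually enforces these limits is not shown, and 1 KB is assumed to be 1024 bytes.

```java
/** Hypothetical holder for the proposed defaults; not an ACF class. */
final class ThrottleDefaults {
    static final int MAX_CONNECTIONS = 2;
    static final int MAX_KB_PER_SECOND = 64;
    static final int MAX_FETCHES_PER_MINUTE = 12;

    private ThrottleDefaults() {}

    /** Average spacing between fetches implied by a per-minute cap, in milliseconds. */
    static long minFetchIntervalMillis(int fetchesPerMinute) {
        if (fetchesPerMinute <= 0) {
            throw new IllegalArgumentException("fetchesPerMinute must be positive");
        }
        return 60_000L / fetchesPerMinute;
    }

    /** Upper bound on bytes transferred per minute at a KB/s cap (assumes 1 KB = 1024 bytes). */
    static long maxBytesPerMinute(int kbPerSecond) {
        if (kbPerSecond < 0) {
            throw new IllegalArgumentException("kbPerSecond must not be negative");
        }
        return kbPerSecond * 1024L * 60L;
    }

    public static void main(String[] args) {
        System.out.println(minFetchIntervalMillis(MAX_FETCHES_PER_MINUTE)); // 5000
        System.out.println(maxBytesPerMinute(MAX_KB_PER_SECOND));           // 3932160
    }
}
```

So 12 fetches per minute works out to one fetch every 5 seconds on average, and 64 KB per second caps transfer at a little under 4 MB per minute.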





[jira] Created: (CONNECTORS-103) RSS connector: Have better initial default values for throttling

2010-09-07 Thread Karl Wright (JIRA)
RSS connector: Have better initial default values for throttling


 Key: CONNECTORS-103
 URL: https://issues.apache.org/jira/browse/CONNECTORS-103
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: RSS connector
Reporter: Karl Wright
Priority: Minor


When you first create an rss connector connection, the bandwidth tab should 
come prepopulated with the following values:

Max connections per server: 2
Max KB per second per server: 64
Max fetches per minute per server: 12

Too many casual users of ACF have been crawling without any throttling, and 
that's going to give ACF a bad name in the long run.




[jira] Created: (CONNECTORS-104) Make it easier to limit a web crawl to a single site

2010-09-07 Thread Jack Krupansky (JIRA)
Make it easier to limit a web crawl to a single site


 Key: CONNECTORS-104
 URL: https://issues.apache.org/jira/browse/CONNECTORS-104
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: Web connector
Affects Versions: LCF Release 0.5
Reporter: Jack Krupansky
Priority: Minor
 Fix For: LCF Release 0.5


Unless the user explicitly enters an include regex carefully, a web crawl can 
quickly get out of control and start crawling the entire web when all the user 
may really want is to crawl just a single web site or portion thereof. So, it 
would be preferable if either by default or with a simple button the crawl 
could be limited to the seed web site(s).
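
As a rough illustration of what such a default could generate, the Java sketch below builds an include regex from the hosts of the seed URLs. The class SeedSiteFilter and its method are hypothetical and not part of the web connector. It assumes the include rules accept java.util.regex syntax, and it limits by exact host only, not by path.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Objects;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper; not part of the ACF web connector. */
final class SeedSiteFilter {
    private SeedSiteFilter() {}

    /**
     * Builds a pattern that matches http/https URLs whose host is exactly
     * one of the seed hosts, with an optional port, path, query or fragment.
     */
    static Pattern includePatternFor(List<String> seedUrls) {
        String hosts = seedUrls.stream()
                .map(u -> URI.create(u).getHost())   // null for URLs without a host
                .filter(Objects::nonNull)
                .map(h -> Pattern.quote(h.toLowerCase(Locale.ROOT)))
                .distinct()
                .collect(Collectors.joining("|"));
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no seed URL with a host");
        }
        return Pattern.compile(
                "^https?://(?:" + hosts + ")(?::\\d+)?(?:[/?#].*)?$",
                Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        // example.com is an illustrative seed, not taken from the issue.
        Pattern include = includePatternFor(List.of("http://example.com/docs/"));
        System.out.println(include.matcher("http://example.com/docs/a.html").matches()); // true
        System.out.println(include.matcher("http://other.org/").matches());              // false
    }
}
```

Quoting each host with Pattern.quote keeps the dots literal, so a look-alike host such as example.com.evil.org does not slip through.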





[jira] Resolved: (CONNECTORS-101) File system connector would benefit by default crawling rules

2010-09-07 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-101.


Fix Version/s: LCF Release 0.5
   Resolution: Fixed

r993551.

By the way, the UI for this connector is also in pretty poor shape, so I may 
open a ticket to clean that up as well.


 File system connector would benefit by default crawling rules
 -

 Key: CONNECTORS-101
 URL: https://issues.apache.org/jira/browse/CONNECTORS-101
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
 Fix For: LCF Release 0.5


 When you add a path to a file system connector job, it should automatically 
 put in rules that cause it to include all files and directories under that 
 path.  This makes it easier to use, and more easily demonstrable too.




[jira] Created: (CONNECTORS-105) File system connector UI no longer adheres to connector UI standards, needs to be updated

2010-09-07 Thread Karl Wright (JIRA)
File system connector UI no longer adheres to connector UI standards, needs to 
be updated
-

 Key: CONNECTORS-105
 URL: https://issues.apache.org/jira/browse/CONNECTORS-105
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Priority: Minor


The file system connector specification Paths tab no longer adheres to the 
prevailing connector standard, which suggests a table for rule list displays.  
The connector UI should be updated.





[jira] Assigned: (CONNECTORS-105) File system connector UI no longer adheres to connector UI standards, needs to be updated

2010-09-07 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-105:
--

Assignee: Karl Wright

 File system connector UI no longer adheres to connector UI standards, needs 
 to be updated
 -

 Key: CONNECTORS-105
 URL: https://issues.apache.org/jira/browse/CONNECTORS-105
 Project: Apache Connectors Framework
  Issue Type: Improvement
  Components: File system connector
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
 Fix For: LCF Release 0.5


 The file system connector specification Paths tab no longer adheres to the 
 prevailing connector standard, which suggests a table for rule list displays. 
  The connector UI should be updated.




[jira] Resolved: (CONNECTORS-105) File system connector UI no longer adheres to connector UI standards, needs to be updated

2010-09-07 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-105.


Fix Version/s: LCF Release 0.5
   Resolution: Fixed

r993565.


