[jira] [Commented] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library

2011-05-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038507#comment-13038507
 ] 

Jan Høydahl commented on CONNECTORS-19:
---

Yeah, I guess the net effect is about the same whether MCF or SolrJ handles the
threads. We could set threadCount=1 and make the buffer size configurable. The
point of switching to SolrJ would be the assumption that its code is more
stable and performant. SOLR-1565 could also make things even faster.
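
As a rough illustration of the "threadCount=1 plus configurable buffer size" idea, here is a minimal sketch in plain Java. It is not SolrJ code and not the ManifoldCF connector; the class name BufferedSender, the STOP sentinel, and the in-memory "sent" list (standing in for the HTTP post) are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only: a bounded buffer drained by one worker thread.
final class BufferedSender {
    private static final String STOP = "\u0000STOP"; // sentinel that ends the worker
    private final BlockingQueue<String> buffer;
    private final List<String> sent = new ArrayList<>();
    private final Thread worker;

    BufferedSender(int bufferSize) {
        buffer = new ArrayBlockingQueue<>(bufferSize);
        worker = new Thread(() -> {
            try {
                for (String doc = buffer.take(); !doc.equals(STOP); doc = buffer.take()) {
                    synchronized (sent) { sent.add(doc); } // stand-in for posting to Solr
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sender-1");
        worker.start();
    }

    // Blocks the caller while the buffer is full, which provides back-pressure.
    void add(String doc) throws InterruptedException {
        buffer.put(doc);
    }

    // Lets the worker drain the queue, stops it, and returns what was "sent".
    List<String> close() throws InterruptedException {
        buffer.put(STOP);
        worker.join();
        synchronized (sent) { return new ArrayList<>(sent); }
    }

    public static void main(String[] args) throws InterruptedException {
        BufferedSender sender = new BufferedSender(2); // buffer size is the tunable
        for (int i = 0; i < 5; i++) sender.add("doc" + i);
        System.out.println(sender.close());
    }
}
```

With a single worker the documents leave in the order they were added, and the buffer size only decides how far the caller may run ahead of the sender.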

 Look into converting SOLR connector to use SolrJ java library
 -

 Key: CONNECTORS-19
 URL: https://issues.apache.org/jira/browse/CONNECTORS-19
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Lucene/SOLR connector
Reporter: Karl Wright
Priority: Minor

 The SOLR connector currently uses its own multipart post code.  It might be a 
 good idea to convert it to use the SolrJ client api jar instead.  This would 
 require license confirmation, plus research to make sure there are no jar 
 conflicts as a result, with any other connector.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library

2011-05-24 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038521#comment-13038521
 ] 

Karl Wright commented on CONNECTORS-19:
---

That's why this ticket was created - to explore using solrj instead of the 
homegrown code currently in the connector.  However, there are issues we need 
to consider before solrj would be an option.  The guaranteed delivery problem 
is one such.  But also if SolrJ spins up its own threads it might well make it 
difficult to shut ManifoldCF down properly, depending on how those threads are 
created.  Just as it is better to use an application server's thread pool when 
you are a web application, the same principles apply for threads created by 
connectors and their supporting libraries.  If you have access to ManifoldCF in 
Action, you might want to have a look at chapters 5 and 6 for details.

However, that does not rule solrj out, it just means we need to be cautious if 
and when the Solr connector is transitioned to use it.  If you want to explore 
this in detail by all means feel free - patches are definitely welcome.
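
The shutdown concern can be sketched in plain Java: if a library borrows an executor from its host instead of creating threads itself, the host can always stop them. This is only an illustration under that assumption; HostOwnedThreads, Indexer, and shutDown are hypothetical names, and neither ManifoldCF nor SolrJ is claimed to work this way.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; this is neither ManifoldCF nor SolrJ code.
final class HostOwnedThreads {
    // A "library" class that never creates threads: it borrows the host's executor.
    static final class Indexer {
        private final ExecutorService executor;
        final AtomicInteger posted = new AtomicInteger();

        Indexer(ExecutorService hostExecutor) {
            this.executor = hostExecutor;
        }

        void index(String doc) {
            executor.execute(() -> posted.incrementAndGet()); // stand-in for a post
        }
    }

    // The host can stop every worker because the host created them.
    static boolean shutDown(ExecutorService executor) throws InterruptedException {
        executor.shutdown(); // accept no new work, finish what is queued
        return executor.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Indexer indexer = new Indexer(pool);
        for (int i = 0; i < 10; i++) indexer.index("doc" + i);
        boolean clean = shutDown(pool);
        System.out.println("clean shutdown: " + clean + ", posted: " + indexer.posted.get());
    }
}
```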




[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin

2011-05-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038651#comment-13038651
 ] 

Jan Høydahl commented on CONNECTORS-202:


I've created a SOLR patch to allow commitWithin as a request parameter.

I guess this means that on the MCF side we could simply set a name/value pair
on the SolrOutputConnector or change from /update/extract to
/update/extract?commitWithin=1.

But for usability's sake it probably makes sense to expose it as an explicit
parameter on the Commits tab, below the "Commit at end of every job" checkbox.
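
A minimal sketch of the "change the handler path" option, assuming the value is given in milliseconds as the Solr wiki describes for commitWithin. UpdatePath and withCommitWithin are hypothetical names, not part of the connector.

```java
// Hypothetical helper; the connector's real configuration code may differ.
final class UpdatePath {
    // Appends commitWithin (assumed to be milliseconds) to an update handler path.
    static String withCommitWithin(String handlerPath, long commitWithinMs) {
        if (commitWithinMs <= 0) {
            return handlerPath; // treat zero or negative as "not set"
        }
        String separator = handlerPath.contains("?") ? "&" : "?";
        return handlerPath + separator + "commitWithin=" + commitWithinMs;
    }

    public static void main(String[] args) {
        // prints /update/extract?commitWithin=10000
        System.out.println(withCommitWithin("/update/extract", 10000));
    }
}
```

An explicit field on the Commits tab could feed the same helper, so the user would not have to edit the handler path by hand.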

 SOLR connector suport for commitWithin
 --

 Key: CONNECTORS-202
 URL: https://issues.apache.org/jira/browse/CONNECTORS-202
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 0.2
Reporter: Jan Høydahl
  Labels: commit

 The output connection must support commitWithin 
 (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22)
  in addition to sending a commit() at the end of a job.
 This allows for efficient handling of commits on the Solr side.
 The parameter should ideally be configurable per job. In that way you could 
 say that for "Important job" commitWithin=10s while for "Big crawl job" 
 commitWithin=600s.



[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin

2011-05-24 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038658#comment-13038658
 ] 

Karl Wright commented on CONNECTORS-202:


Yes, making it explicit is preferred.  But I thought you wanted to be able to 
set this on a per-job basis?




[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin

2011-05-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038666#comment-13038666
 ] 

Jan Høydahl commented on CONNECTORS-202:


Right, that would be the best. If the param is set per job it should override 
the default on the output connector. This could be a pet project for me to 
contribute something simple to MCF :)
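
The override rule is small enough to sketch directly. This assumes both settings are optional and expressed in milliseconds; CommitWithinSetting and effective are hypothetical names chosen for the example.

```java
// Hypothetical sketch of "job setting overrides the output connector default".
final class CommitWithinSetting {
    // A null argument means "not configured at that level".
    static Long effective(Long jobValueMs, Long connectionDefaultMs) {
        return jobValueMs != null ? jobValueMs : connectionDefaultMs;
    }

    public static void main(String[] args) {
        System.out.println(effective(10_000L, 600_000L)); // job value is used
        System.out.println(effective(null, 600_000L));    // falls back to the default
        System.out.println(effective(null, null));        // nothing set: no commitWithin
    }
}
```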



Turning on Logging

2011-05-24 Thread hokie
What is the properties.xml line to turn on logging, say for connectors
and db? Is it the following? Do you say value = true, enabled, or on?


<property name="org.apache.manifoldcf.db" value="true"/>
<property name="org.apache.manifoldcf.connectors" value="true"/>

Thanks,
Farzad.


Re: Turning on Logging

2011-05-24 Thread Karl Wright
The rootLogger in the .ini file may be set to DEBUG, but the
properties.xml loggers explicitly control each subsystem, so those win
(for subsystems).

Karl
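
The precedence described above (a level set on a named logger wins over the root level) can be illustrated without extra jars. Note the swap: this sketch uses java.util.logging as a stand-in for log4j, with FINE playing the role of DEBUG and WARNING the role of WARN. It assumes the properties.xml values end up as levels on named loggers; LoggerPrecedence and the org.example.other logger are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Stand-in demo using java.util.logging, not log4j and not ManifoldCF's Logging class.
final class LoggerPrecedence {
    static final Logger ROOT = Logger.getLogger("");
    static final Logger CONNECTORS = Logger.getLogger("org.apache.manifoldcf.connectors");
    static final Logger OTHER = Logger.getLogger("org.example.other");

    static void configure() {
        ROOT.setLevel(Level.FINE);          // roughly rootLogger=DEBUG
        CONNECTORS.setLevel(Level.WARNING); // roughly the per-subsystem value WARN
    }

    public static void main(String[] args) {
        configure();
        System.out.println(CONNECTORS.isLoggable(Level.FINE));    // false: explicit level wins
        System.out.println(CONNECTORS.isLoggable(Level.WARNING)); // true
        System.out.println(OTHER.isLoggable(Level.FINE));         // true: inherits from root
    }
}
```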


On Tue, May 24, 2011 at 3:17 PM,  ho...@farzad.net wrote:
 So I can control the logging not only from the logging.ini file but also
 via the property? Meaning, if the property is set to WARN, but the ini file
 says DEBUG for the appender, all I get is WARN?

 Here is what I have in my .ini file:

 # Two appenders defined: one for a file log (MAIN) and the other to standard out (STDOUT)
 log4j.rootLogger=DEBUG, MAIN, STDOUT
 log4j.appender.MAIN=org.apache.log4j.RollingFileAppender
 log4j.appender.MAIN.File=logs/manifoldcf.log
 log4j.appender.MAIN.MaxFileSize=1KB
 log4j.appender.MAIN.MaxBackupIndex=10
 log4j.appender.MAIN.layout=org.apache.log4j.PatternLayout
 log4j.appender.MAIN.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n

 log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender
 log4j.appender.STDOUT.layout=org.apache.log4j.PatternLayout
 log4j.appender.STDOUT.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n


 On Tue, 24 May 2011 15:10:16 -0400, Karl Wright daddy...@gmail.com wrote:

 <property name="org.apache.manifoldcf.db" value="DEBUG"/>
 <property name="org.apache.manifoldcf.connectors" value="DEBUG"/>

 Karl






Re: Turning on Logging

2011-05-24 Thread hokie
Got it working :) and the threads have ids! I can remove my hack.
Though I think it would be nice if the sample app properties.xml file had
all the loggers set at WARN level. I also think we should modify the
logging.ini file to contain the following; the main differences are log
files that roll over and printing the thread ids. Want me to make a patch?


log4j.rootLogger=DEBUG, MAIN
log4j.appender.MAIN=org.apache.log4j.RollingFileAppender
log4j.appender.MAIN.File=logs/manifoldcf.log
log4j.appender.MAIN.MaxFileSize=1KB
log4j.appender.MAIN.MaxBackupIndex=10
log4j.appender.MAIN.layout=org.apache.log4j.PatternLayout
log4j.appender.MAIN.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n










Re: Turning on Logging

2011-05-24 Thread Karl Wright
The sample app logging.ini file was already updated; if you think it's
still not right, please open a new ticket for it and attach your patch.

Adding the loggers to the properties.xml file is fine by me; the
default value is already WARN.  Usually when this is done in other
projects people include the appropriate property statement but have it
commented out.  Feel free to come up with a separate ticket and patch
for that issue, though.


Karl








Re: Turning on Logging

2011-05-24 Thread hokie
I like the comment idea; my goal is to leave a syntax sample somewhere.
I updated my source, and the logging.ini file is fine.


I have a problem with logging: I don't see any of my log messages at
all.


I imported org.apache.manifoldcf.crawler.system.Logging and have the
following lines in my constructor. I see the System.out print, but not
the log message. Thoughts?


Logging.connectors.debug("DupFinderConnector Constructor Called");
System.out.println("DUP CONS CALLED");

Thanks!
