[jira] [Commented] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038507#comment-13038507 ]

Jan Høydahl commented on CONNECTORS-19:
---------------------------------------

Yea, guess the net effect is about the same if MCF handles the threads or SolrJ does. Guess we could set threadCount=1 and make buffer size configurable. The point of switching to SolrJ would be the assumption that code is more stable and performant. Also SOLR-1565 could make things even faster.

Look into converting SOLR connector to use SolrJ java library
-------------------------------------------------------------

                Key: CONNECTORS-19
                URL: https://issues.apache.org/jira/browse/CONNECTORS-19
            Project: ManifoldCF
         Issue Type: Improvement
         Components: Lucene/SOLR connector
           Reporter: Karl Wright
           Priority: Minor

The SOLR connector currently uses its own multipart post code. It might be a good idea to convert it to use the SolrJ client api jar instead. This would require license confirmation, plus research to make sure there are no jar conflicts as a result, with any other connector.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038521#comment-13038521 ]

Karl Wright commented on CONNECTORS-19:
---------------------------------------

That's why this ticket was created - to explore using solrj instead of the homegrown code currently in the connector. However, there are issues we need to consider before solrj would be an option. The guaranteed delivery problem is one such. But also if SolrJ spins up its own threads it might well make it difficult to shut ManifoldCF down properly, depending on how those threads are created. Just as it is better to use an application server's thread pool when you are a web application, the same principles apply for threads created by connectors and their supporting libraries. If you have access to ManifoldCF in Action, you might want to have a look at chapters 5 and 6 for details.

However, that does not rule solrj out, it just means we need to be cautious if and when the Solr connector is transitioned to use it. If you want to explore this in detail by all means feel free - patches are definitely welcome.
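
The thread-ownership concern above can be pictured with a small, self-contained Java sketch. This is not ManifoldCF or SolrJ code: BufferedSender is a hypothetical stand-in for a supporting library that accepts an executor from its host instead of creating threads itself, which is what lets the host shut everything down in an orderly way.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HostManagedThreads {

    /** Hypothetical library class: it borrows the host's executor and never spawns threads. */
    static final class BufferedSender {
        private final ExecutorService executor;
        private final AtomicInteger sent = new AtomicInteger();

        BufferedSender(ExecutorService executor) {
            this.executor = executor;
        }

        /** Queues one document; the "send" here only counts it. */
        void send(String document) {
            executor.submit(() -> {
                sent.incrementAndGet();
            });
        }

        int sentCount() {
            return sent.get();
        }
    }

    /**
     * The host owns the pool, so it decides when the threads stop.
     * Returns true if all work finished and the pool terminated.
     */
    static boolean runAndShutDown(int documents) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        BufferedSender sender = new BufferedSender(pool);
        for (int i = 0; i < documents; i++) {
            sender.send("doc-" + i);
        }
        pool.shutdown(); // stop accepting work, let queued tasks finish
        try {
            boolean terminated = pool.awaitTermination(5, TimeUnit.SECONDS);
            return terminated && sender.sentCount() == documents;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("clean shutdown: " + runAndShutDown(10));
    }
}
```

If the library created its own threads internally, the host would have no handle on which to call shutdown(), which is the difficulty described in the comment.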
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038651#comment-13038651 ]

Jan Høydahl commented on CONNECTORS-202:
----------------------------------------

I've created a SOLR patch to allow commitWithin as a request parameter. I guess this means that on the MCF side we could simply set a Name/Value pair on the SolrOutputConnector or change from /update/extract to /update/extract?commitWithin=1. But probably for usability's sake it makes sense to state it as an explicit param on the Commits tab, below the "Commit at end of every job" checkbox.

SOLR connector suport for commitWithin
--------------------------------------

                Key: CONNECTORS-202
                URL: https://issues.apache.org/jira/browse/CONNECTORS-202
            Project: ManifoldCF
         Issue Type: Improvement
         Components: Lucene/SOLR connector
   Affects Versions: ManifoldCF 0.2
           Reporter: Jan Høydahl
             Labels: commit

The output connection must support commitWithin (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22) in addition to sending a commit() at the end of a job. This allows for efficient handling of commits on the Solr side. The parameter should ideally be configurable per job. In that way you could say that for an "Important" job commitWithin=10s, while for a "Big crawl" job commitWithin=600s.
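
The request-parameter variant mentioned in the comment amounts to appending commitWithin to the update handler path. The sketch below is only an illustration: withCommitWithin is a hypothetical helper, not part of ManifoldCF or SolrJ, and it assumes the value is given in milliseconds (so the "10s" in the issue text becomes 10000).

```java
public class CommitWithinParam {

    /**
     * Appends commitWithin (milliseconds) to an update handler path.
     * A null value leaves the path unchanged.
     */
    static String withCommitWithin(String handlerPath, Long commitWithinMs) {
        if (commitWithinMs == null) {
            return handlerPath;
        }
        if (commitWithinMs < 0) {
            throw new IllegalArgumentException("commitWithin must not be negative");
        }
        // Use '&' when the path already carries a query string.
        String separator = handlerPath.contains("?") ? "&" : "?";
        return handlerPath + separator + "commitWithin=" + commitWithinMs;
    }

    public static void main(String[] args) {
        // prints /update/extract?commitWithin=10000
        System.out.println(withCommitWithin("/update/extract", 10000L));
        // prints /update/extract?literal.source=web&commitWithin=600000
        System.out.println(withCommitWithin("/update/extract?literal.source=web", 600000L));
    }
}
```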
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038658#comment-13038658 ]

Karl Wright commented on CONNECTORS-202:
----------------------------------------

Yes, making it explicit is preferred. But I thought you wanted to be able to set this on a per-job basis?
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038666#comment-13038666 ]

Jan Høydahl commented on CONNECTORS-202:
----------------------------------------

Right, that would be the best. If the param is set per job it should override the default on the output connector. This could be a pet project for me to contribute something simple to MCF :)
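
The override rule proposed here (a per-job value wins over the output connection's default) is simple enough to state as code. This is a minimal sketch with hypothetical names; the thread does not say how ManifoldCF would store either value, so both are modelled as nullable Long values in milliseconds.

```java
public class CommitWithinOverride {

    /**
     * Returns the commitWithin value to use for a job:
     * the job's own setting if present, otherwise the connection default.
     * Returns null when neither is set, meaning no commitWithin is sent.
     */
    static Long effectiveCommitWithin(Long jobValue, Long connectionDefault) {
        return jobValue != null ? jobValue : connectionDefault;
    }

    public static void main(String[] args) {
        Long connectionDefault = 600000L; // assumed default, e.g. for a big crawl

        // Job with its own setting: the job value wins.
        System.out.println(effectiveCommitWithin(10000L, connectionDefault));
        // Job without a setting: falls back to the connection default.
        System.out.println(effectiveCommitWithin(null, connectionDefault));
    }
}
```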
Turning on Logging
What is the properties.xml line to turn on logging, say for connectors and db? Is it the following? Do you say value = true, or enabled, or on?

  <property name="org.apache.manifoldcf.db" value="true"/>
  <property name="org.apache.manifoldcf.connectors" value="true"/>

Thanks,
Farzad.
Re: Turning on Logging
The rootLogger in the .ini file may be set to DEBUG, but the properties.xml loggers explicitly control each subsystem, so those win (for subsystems).

Karl

On Tue, May 24, 2011 at 3:17 PM, ho...@farzad.net wrote:
> So then not only can I control the logging from the logging.ini file, I can also control it via the property. Meaning, if the property is set to WARN, but the ini file says DEBUG for the appender, all I get is WARN?
>
> Here is what I have in my .ini file:
>
> # Two appenders defined, one for a file log (MAIN) and the other to standard out (STDOUT)
> log4j.rootLogger=DEBUG, MAIN, STDOUT
> log4j.appender.MAIN=org.apache.log4j.RollingFileAppender
> log4j.appender.MAIN.File=logs/manifoldcf.log
> log4j.appender.MAIN.MaxFileSize=1KB
> log4j.appender.MAIN.MaxBackupIndex=10
> log4j.appender.MAIN.layout=org.apache.log4j.PatternLayout
> log4j.appender.MAIN.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n
> log4j.appender.STDOUT=org.apache.log4j.ConsoleAppender
> log4j.appender.STDOUT.layout=org.apache.log4j.PatternLayout
> log4j.appender.STDOUT.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n
>
> On Tue, 24 May 2011 15:10:16 -0400, Karl Wright daddy...@gmail.com wrote:
>> <property name="org.apache.manifoldcf.db" value="DEBUG"/>
>> <property name="org.apache.manifoldcf.connectors" value="DEBUG"/>
>>
>> Karl
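
The precedence described in this reply (an explicit per-subsystem level wins over the root logger's level) is ordinary logger-hierarchy behaviour. The sketch below demonstrates it with java.util.logging purely so that it runs without extra jars; ManifoldCF itself uses log4j, and the assumption here is that each properties.xml value ends up as an explicit level on the named logger. Level.FINE stands in for DEBUG and Level.WARNING for WARN.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class SubsystemLevelDemo {

    /**
     * Sets the root logger to FINE (roughly DEBUG), gives the named logger
     * the explicit level passed in (null means "inherit from root"), and
     * reports whether debug-level messages would be logged by it.
     */
    static boolean debugEnabled(String loggerName, Level explicitLevel) {
        Logger root = Logger.getLogger("");
        root.setLevel(Level.FINE);

        Logger subsystem = Logger.getLogger(loggerName);
        subsystem.setLevel(explicitLevel);
        return subsystem.isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        // Explicit WARNING on the subsystem overrides the root's FINE: prints false
        System.out.println(debugEnabled("org.apache.manifoldcf.connectors", Level.WARNING));
        // No explicit level, so the root's FINE is inherited: prints true
        System.out.println(debugEnabled("org.example.unconfigured", null));
    }
}
```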
Re: Turning on Logging
Got it working :) and the threads have IDs! I can remove my hack.

Though, I think it would be nice if the sample app properties.xml file had all the loggers on at WARN level. I also think we should modify the logging.ini file to contain the following; the main differences are file spanning and printing out the thread IDs. Want me to make a patch?

log4j.rootLogger=DEBUG, MAIN
log4j.appender.MAIN=org.apache.log4j.RollingFileAppender
log4j.appender.MAIN.File=logs/manifoldcf.log
log4j.appender.MAIN.MaxFileSize=1KB
log4j.appender.MAIN.MaxBackupIndex=10
log4j.appender.MAIN.layout=org.apache.log4j.PatternLayout
log4j.appender.MAIN.layout.ConversionPattern=%d %5p [%t] (%F:%L) - %m%n
Re: Turning on Logging
The sample app logging.ini file was already updated; if you think it's still not right, please open a new ticket for it and attach your patch.

Adding the loggers to the properties.xml file is fine by me; the default value is already WARN. Usually when this is done in other projects, people include the appropriate property statement but have it commented out. Feel free to come up with a separate ticket and patch for that issue, though.

Karl
Re: Turning on Logging
I like the comment idea; my goal is to leave a syntax sample somewhere. I updated my source, and the logging.ini file is fine.

I have a problem with logging, though: I don't see any of my log messages at all. I imported org.apache.manifoldcf.crawler.system.Logging and have the following lines in my constructor. I see the System.out print, but not the log message. Thoughts?

  Logging.connectors.debug("DupFinderConnector Constructor Called");
  System.out.println("DUP CONS CALLED");

Thanks!