[jira] [Commented] (NUTCH-2233) Index-basic incorrect assignment of next fetch time when using Mongodb as storage backend

2017-01-19 Thread Wallace Xia (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831030#comment-15831030
 ] 

Wallace Xia commented on NUTCH-2233:


Hi, how did you resolve this problem? Could you comment on it? Thanks.

> Index-basic incorrect assignment of next fetch time when using Mongodb as 
> storage backend
> -
>
> Key: NUTCH-2233
> URL: https://issues.apache.org/jira/browse/NUTCH-2233
> Project: Nutch
> Issue Type: Bug
> Components: plugin
> Affects Versions: 2.3.1
> Environment: Mongodb, Elasticsearch.
> Reporter: Pablo Torres
>
> The patch in https://issues.apache.org/jira/browse/NUTCH-2045 does not work 
> when MongoDB is the storage backend. Date properties are stored as Longs in 
> MongoDB rather than as objects, so a null date becomes 0, which the patch 
> accepts as valid. As a result, the system indexes 01/01/1970 as tstamp.
> I found this issue using MongoDB as storage and Elasticsearch as index.
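
The failure mode described above can be illustrated with a minimal sketch. It is not the NUTCH-2045 patch or any Nutch code: the class `FetchTimeGuard` and the method `toTimestamp` are hypothetical names, and treating every non-positive value as "unset" is an assumption made for illustration. It only shows why a null check alone is not enough when the backend hands back a Long where 0 stands for a missing date.

```java
import java.time.Instant;
import java.util.Optional;

/** Hypothetical helper, not part of Nutch: maps "unset" epoch values to an empty result. */
final class FetchTimeGuard {
    private FetchTimeGuard() {}

    /**
     * Converts a stored epoch-millisecond value to an Instant.
     * Returns empty for null and, by assumption, for any non-positive value,
     * since a backend that stores dates as Longs may represent "no date" as 0.
     */
    static Optional<Instant> toTimestamp(Long epochMillis) {
        if (epochMillis == null || epochMillis <= 0L) {
            return Optional.empty();
        }
        return Optional.of(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // A null-only check would let 0 through and yield 1970-01-01.
        System.out.println(toTimestamp(0L).isPresent());
        System.out.println(toTimestamp(null).isPresent());
        System.out.println(toTimestamp(1484784000000L).isPresent());
    }
}
```

One trade-off of this approach: a document whose date really is the epoch itself cannot be told apart from an unset one, which is probably acceptable for crawl timestamps but is worth stating.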



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NUTCH-2352) Log with Generic Class Name at Nutch 1.x

2017-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830758#comment-15830758
 ] 

Hudson commented on NUTCH-2352:
---

SUCCESS: Integrated in Jenkins build Nutch-trunk #3404 (See 
[https://builds.apache.org/job/Nutch-trunk/3404/])
NUTCH-2352 Logging with generic class name, closes #172 (snagel: rev 
2b93a66f0472e93223c69053d5482dcbef26de6d)
* (edit) 
src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/cosine/Model.java
* (edit) src/java/org/apache/nutch/fetcher/FetcherThread.java
* (edit) 
src/plugin/protocol-selenium/src/java/org/apache/nutch/protocol/selenium/Http.java
* (edit) 
src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java
* (edit) src/java/org/apache/nutch/service/NutchServer.java
* (edit) src/java/org/apache/nutch/fetcher/FetchItemQueue.java
* (edit) 
src/plugin/urlmeta/src/java/org/apache/nutch/indexer/urlmeta/URLMetaIndexingFilter.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/WebGraph.java
* (edit) 
src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/DummyX509TrustManager.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/ScoreUpdater.java
* (edit) 
src/plugin/urlnormalizer-protocol/src/java/org/apache/nutch/net/urlnormalizer/protocol/ProtocolURLNormalizer.java
* (edit) src/java/org/apache/nutch/net/URLNormalizers.java
* (edit) src/java/org/apache/nutch/crawl/CrawlDbMerger.java
* (edit) 
src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpRobotRulesParser.java
* (edit) src/java/org/apache/nutch/parse/ParseUtil.java
* (edit) src/plugin/feed/src/java/org/apache/nutch/parse/feed/FeedParser.java
* (edit) src/java/org/apache/nutch/crawl/CrawlDb.java
* (edit) src/test/org/apache/nutch/tools/proxy/ProxyTestbed.java
* (edit) 
src/plugin/lib-regex-filter/src/test/org/apache/nutch/urlfilter/api/RegexURLFilterBaseTest.java
* (edit) src/java/org/apache/nutch/service/impl/JobWorker.java
* (edit) 
src/plugin/indexer-elastic/src/java/org/apache/nutch/indexwriter/elastic/ElasticIndexWriter.java
* (edit) src/plugin/parse-zip/src/java/org/apache/nutch/parse/zip/ZipParser.java
* (edit) 
src/plugin/parse-js/src/java/org/apache/nutch/parse/js/JSParseFilter.java
* (edit) 
src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/FtpRobotRulesParser.java
* (edit) src/java/org/apache/nutch/parse/ParseResult.java
* (edit) src/java/org/apache/nutch/fetcher/QueueFeeder.java
* (edit) src/java/org/apache/nutch/parse/ParseSegment.java
* (edit) src/java/org/apache/nutch/hostdb/UpdateHostDbMapper.java
* (edit) 
src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
* (edit) src/java/org/apache/nutch/util/domain/DomainSuffixesReader.java
* (edit) 
src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer/basic/BasicURLNormalizer.java
* (edit) src/java/org/apache/nutch/indexer/IndexingFilters.java
* (edit) src/java/org/apache/nutch/util/DomUtil.java
* (edit) 
src/plugin/urlmeta/src/java/org/apache/nutch/scoring/urlmeta/URLMetaScoringFilter.java
* (edit) 
src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/HttpAuthenticationFactory.java
* (edit) src/java/org/apache/nutch/crawl/Injector.java
* (edit) 
src/plugin/parse-html/src/test/org/apache/nutch/parse/html/TestHtmlParser.java
* (edit) src/java/org/apache/nutch/util/domain/DomainStatistics.java
* (edit) 
src/plugin/publish-rabbitmq/src/java/org/apache/nutch/publisher/rabbitmq/RabbitMQPublisherImpl.java
* (edit) 
src/plugin/urlnormalizer-querystring/src/java/org/apache/nutch/net/urlnormalizer/querystring/QuerystringURLNormalizer.java
* (edit) 
src/plugin/protocol-interactiveselenium/src/java/org/apache/nutch/protocol/interactiveselenium/handlers/DefaultClickAllAjaxLinksHandler.java
* (edit) src/test/org/apache/nutch/crawl/TODOTestCrawlDbStates.java
* (edit) src/java/org/apache/nutch/tools/FileDumper.java
* (edit) src/java/org/apache/nutch/segment/SegmentMergeFilters.java
* (edit) src/java/org/apache/nutch/webui/service/impl/NutchServiceImpl.java
* (edit) src/java/org/apache/nutch/hostdb/ReadHostDb.java
* (edit) src/test/org/apache/nutch/service/TestNutchServer.java
* (edit) src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
* (edit) 
src/plugin/urlfilter-domain/src/java/org/apache/nutch/urlfilter/domain/DomainURLFilter.java
* (edit) 
src/plugin/urlnormalizer-regex/src/test/org/apache/nutch/net/urlnormalizer/regex/TestRegexURLNormalizer.java
* (edit) src/java/org/apache/nutch/util/ProtocolStatusStatistics.java
* (edit) 
src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/cosine/CosineSimilarity.java
* (edit) src/test/org/apache/nutch/crawl/CrawlDbUpdateUtil.java
* (edit) 
src/plugin/urlfilter-ignoreexempt/src/java/org/apache/nutch/urlfilter/ignoreexempt/ExemptionUrlFilter.java
* (edit) 
src/plugin/protocol-httpclient/src/java/org/apache/nu

[jira] [Commented] (NUTCH-2351) Log with Generic Class Name at Nutch 2.x

2017-01-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830740#comment-15830740
 ] 

Hudson commented on NUTCH-2351:
---

SUCCESS: Integrated in Jenkins build Nutch-nutchgora #1578 (See 
[https://builds.apache.org/job/Nutch-nutchgora/1578/])
NUTCH-2351 Logging with generic class name. (snagel: rev 
1a84334c115bfda16980cd822da31ba5ae401afe)
* (edit) src/java/org/apache/nutch/fetcher/FetcherReducer.java
* (edit) 
src/plugin/urlfilter-suffix/src/java/org/apache/nutch/urlfilter/suffix/SuffixURLFilter.java
* (edit) src/java/org/apache/nutch/crawl/AbstractFetchSchedule.java
* (edit) src/java/org/apache/nutch/util/domain/DomainStatistics.java
* (edit) 
src/plugin/index-html/src/java/org/apache/nutch/indexer/html/HtmlIndexingFilter.java
* (edit) 
src/plugin/subcollection/src/java/org/apache/nutch/indexer/subcollection/SubcollectionIndexingFilter.java
* (edit) 
src/plugin/parse-tika/src/test/org/apache/nutch/parse/tika/DOMContentUtilsTest.java
* (edit) src/java/org/apache/nutch/indexer/IndexingFiltersChecker.java
* (edit) src/java/org/apache/nutch/tools/DmozParser.java
* (edit) src/java/org/apache/nutch/webui/client/impl/CrawlingCycle.java
* (edit) 
src/plugin/language-identifier/src/java/org/apache/nutch/analysis/lang/HTMLLanguageParser.java
* (edit) 
src/plugin/lib-regex-filter/src/java/org/apache/nutch/urlfilter/api/RegexURLFilterBase.java
* (edit) 
src/plugin/creativecommons/src/java/org/creativecommons/nutch/CCIndexingFilter.java
* (edit) 
src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java
* (edit) src/java/org/apache/nutch/tools/Benchmark.java
* (edit) src/java/org/apache/nutch/util/MimeUtil.java
* (edit) src/plugin/parse-zip/src/java/org/apache/nutch/parse/zip/ZipParser.java
* (edit) src/plugin/parse-ext/src/java/org/apache/nutch/parse/ext/ExtParser.java
* (edit) src/java/org/apache/nutch/host/HostDbReader.java
* (edit) src/java/org/apache/nutch/tools/proxy/LogDebugHandler.java
* (edit) 
src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java
* (edit) src/plugin/parse-swf/src/java/org/apache/nutch/parse/swf/SWFParser.java
* (edit) src/java/org/apache/nutch/net/URLNormalizers.java
* (edit) src/java/org/apache/nutch/host/HostInjectorJob.java
* (edit) src/java/org/apache/nutch/plugin/PluginDescriptor.java
* (edit) 
src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/HttpRobotRulesParser.java
* (edit) src/java/org/apache/nutch/util/DomUtil.java
* (edit) 
src/plugin/parse-tika/src/java/org/apache/nutch/parse/tika/TikaParser.java
* (edit) src/java/org/apache/nutch/parse/ParseUtil.java
* (edit) src/java/org/apache/nutch/util/GZIPUtils.java
* (edit) src/java/org/apache/nutch/host/HostDb.java
* (edit) src/java/org/apache/nutch/indexer/IndexUtil.java
* (edit) src/java/org/apache/nutch/indexer/IndexWriters.java
* (edit) src/java/org/apache/nutch/host/HostDbUpdateJob.java
* (edit) src/java/org/apache/nutch/webui/service/impl/NutchServiceImpl.java
* (edit) 
src/plugin/index-anchor/src/java/org/apache/nutch/indexer/anchor/AnchorIndexingFilter.java
* (edit) src/java/org/apache/nutch/api/resources/AdminResource.java
* (edit) src/java/org/apache/nutch/api/impl/JobWorker.java
* (edit) src/java/org/apache/nutch/plugin/PluginManifestParser.java
* (edit) src/java/org/apache/nutch/webui/client/impl/RemoteCommandExecutor.java
* (edit) src/java/org/apache/nutch/crawl/SignatureFactory.java
* (edit) src/java/org/apache/nutch/parse/ParserJob.java
* (edit) src/java/org/apache/nutch/parse/ParserFactory.java
* (edit) 
src/plugin/lib-regex-filter/src/test/org/apache/nutch/urlfilter/api/RegexURLFilterBaseTest.java
* (edit) 
src/plugin/urlnormalizer-basic/src/java/org/apache/nutch/net/urlnormalizer/basic/BasicURLNormalizer.java
* (edit) src/java/org/apache/nutch/crawl/WebTableReader.java
* (edit) 
src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java
* (edit) 
src/plugin/urlfilter-domain/src/java/org/apache/nutch/urlfilter/domain/DomainURLFilter.java
* (edit) src/java/org/apache/nutch/parse/ParserChecker.java
* (edit) src/java/org/apache/nutch/api/NutchServer.java
* (edit) src/java/org/apache/nutch/indexer/IndexingFilters.java
* (edit) src/java/org/apache/nutch/util/ObjectCache.java
* (edit) 
src/plugin/protocol-ftp/src/java/org/apache/nutch/protocol/ftp/FtpRobotRulesParser.java
* (edit) 
src/plugin/tld/src/java/org/apache/nutch/indexer/tld/TLDIndexingFilter.java
* (edit) 
src/plugin/protocol-httpclient/src/java/org/apache/nutch/protocol/httpclient/DummySSLProtocolSocketFactory.java
* (edit) src/java/org/apache/nutch/api/security/SecurityUtils.java
* (edit) src/java/org/apache/nutch/indexer/solr/SolrDeleteDuplicates.java
* (edit) src/java/org/apache/nutch/plugin/PluginRepository.java
* (edit) src/java/org/apache/nutch/util/EncodingDetector.java
* (edit) 
src/plugin/parse-html/src/test/org/apache/nutch/parse/html/TestHtmlPars

[jira] [Resolved] (NUTCH-2351) Log with Generic Class Name at Nutch 2.x

2017-01-19 Thread Sebastian Nagel (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel resolved NUTCH-2351.

Resolution: Fixed

Committed to 2.x, thanks [~kamaci]!

> Log with Generic Class Name at Nutch 2.x
> 
>
> Key: NUTCH-2351
> URL: https://issues.apache.org/jira/browse/NUTCH-2351
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 2.3.1
> Reporter: Furkan KAMACI
> Assignee: Furkan KAMACI
> Priority: Minor
> Fix For: 2.4
>
>
> Mistakes are common when reference code is copied to create a new class that 
> uses a logger. We can log with a generic class name to avoid them, for 
> example:
> {code:java}
> private static final Logger LOG = 
> LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
> {code}
> (cf. SOLR-8324)
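
To make the copy-and-paste problem concrete, here is a small sketch that runs on the JDK alone. The classes `OriginalParser` and `CopiedParser` are hypothetical and do not exist in Nutch, and SLF4J is deliberately left out: the sketch only compares the class names that would be handed to `LoggerFactory.getLogger(...)`, assuming the logger is named after the class it receives.

```java
import java.lang.invoke.MethodHandles;

/** Prints the names each hypothetical class would use for its logger. */
final class LoggerNameDemo {
    public static void main(String[] args) {
        System.out.println(OriginalParser.LOGGER_NAME);
        System.out.println(CopiedParser.STALE_LOGGER_NAME);
        System.out.println(CopiedParser.LOGGER_NAME);
    }
}

/** Hypothetical class the code is copied from. */
final class OriginalParser {
    // Resolved from the declaring class, so this line can be copied verbatim.
    static final String LOGGER_NAME =
        MethodHandles.lookup().lookupClass().getName();
}

/** Hypothetical class created by copying OriginalParser. */
final class CopiedParser {
    // Copy-paste mistake: the class literal still names the source class.
    static final String STALE_LOGGER_NAME = OriginalParser.class.getName();

    // Same text as in OriginalParser, yet it resolves to CopiedParser.
    static final String LOGGER_NAME =
        MethodHandles.lookup().lookupClass().getName();
}
```

The generic form works because `MethodHandles.lookup()` creates a lookup for the class that calls it, so the declaration needs no edit after being pasted into another class.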





[jira] [Resolved] (NUTCH-2352) Log with Generic Class Name at Nutch 1.x

2017-01-19 Thread Sebastian Nagel (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sebastian Nagel resolved NUTCH-2352.

Resolution: Fixed

Committed to 1.x. Thanks, [~kamaci]!

> Log with Generic Class Name at Nutch 1.x
> 
>
> Key: NUTCH-2352
> URL: https://issues.apache.org/jira/browse/NUTCH-2352
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 1.12
> Reporter: Furkan KAMACI
> Assignee: Furkan KAMACI
> Priority: Minor
> Fix For: 1.13
>
>
> Mistakes are common when reference code is copied to create a new class that 
> uses a logger. We can log with a generic class name to avoid them, for 
> example:
> {code:java}
> private static final Logger LOG = 
> LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
> {code}
> (cf. SOLR-8324)





[jira] [Commented] (NUTCH-2352) Log with Generic Class Name at Nutch 1.x

2017-01-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830663#comment-15830663
 ] 

ASF GitHub Bot commented on NUTCH-2352:
---

Github user asfgit closed the pull request at:

https://github.com/apache/nutch/pull/172







[GitHub] nutch pull request #172: NUTCH-2352 Logging with generic class name.

2017-01-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nutch/pull/172


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (NUTCH-2352) Log with Generic Class Name at Nutch 1.x

2017-01-19 Thread Sebastian Nagel (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830548#comment-15830548
 ] 

Sebastian Nagel commented on NUTCH-2352:


+1 lgtm, going to commit...



