[ 
https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17778206#comment-17778206
 ] 

Hudson commented on NUTCH-3013:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #134 (See [https://ci-builds.apache.org/job/Nutch/job/Nutch-trunk/134/])
NUTCH-3013 Employ commons-lang3's StopWatch to simplify timing logic (#788) (github: [https://github.com/apache/nutch/commit/8431dcfe52f5395a0fd9e3c00db009dbb2bcf6f5])
* (edit) src/java/org/apache/nutch/crawl/Injector.java
* (edit) src/plugin/lib-regex-filter/src/test/org/apache/nutch/urlfilter/api/RegexURLFilterBaseTest.java
* (edit) src/java/org/apache/nutch/parse/ParseSegment.java
* (edit) src/java/org/apache/nutch/tools/arc/ArcSegmentCreator.java
* (edit) .gitignore
* (edit) src/java/org/apache/nutch/util/domain/DomainStatistics.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/NodeDumper.java
* (edit) src/plugin/urlnormalizer-regex/src/test/org/apache/nutch/net/urlnormalizer/regex/TestRegexURLNormalizer.java
* (edit) src/java/org/apache/nutch/util/CrawlCompletionStats.java
* (edit) src/java/org/apache/nutch/tools/FreeGenerator.java
* (edit) src/java/org/apache/nutch/hostdb/ReadHostDb.java
* (edit) src/java/org/apache/nutch/indexer/IndexingJob.java
* (edit) .github/workflows/master-build.yml
* (edit) src/java/org/apache/nutch/util/ProtocolStatusStatistics.java
* (edit) src/java/org/apache/nutch/fetcher/Fetcher.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/LinkDumper.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/ScoreUpdater.java
* (edit) src/java/org/apache/nutch/indexer/CleaningJob.java
* (edit) src/java/org/apache/nutch/crawl/CrawlDbMerger.java
* (edit) src/java/org/apache/nutch/hostdb/UpdateHostDb.java
* (edit) src/java/org/apache/nutch/crawl/DeduplicationJob.java
* (edit) src/java/org/apache/nutch/crawl/CrawlDb.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/LinkRank.java
* (edit) src/java/org/apache/nutch/util/SitemapProcessor.java
* (edit) src/java/org/apache/nutch/crawl/LinkDbReader.java
* (edit) src/java/org/apache/nutch/crawl/Generator.java
* (edit) src/java/org/apache/nutch/crawl/LinkDbMerger.java
* (edit) src/java/org/apache/nutch/scoring/webgraph/WebGraph.java
* (edit) src/java/org/apache/nutch/tools/warc/WARCExporter.java
* (edit) src/java/org/apache/nutch/crawl/LinkDb.java


> Employ commons-lang3's StopWatch to simplify timing logic
> ---------------------------------------------------------
>
>                 Key: NUTCH-3013
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3013
>             Project: Nutch
>          Issue Type: Improvement
>          Components: logging, runtime, util
>    Affects Versions: 1.19
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>              Labels: timing
>             Fix For: 1.20
>
>
> While running some experiments integrating Nutch and [Celeborn 
> (Incubating)|https://celeborn.apache.org/], I started thinking about 
> runtime timings. After some investigation I came across [commons-lang3's 
> StopWatch 
> class|https://commons.apache.org/proper/commons-lang/javadocs/api-release/index.html?org/apache/commons/lang3/time/StopWatch.html]
>  which provides a convenient API for timings.
> Since we already declare the commons-lang3 dependency, I think StopWatch 
> could help us clean up some timing logic in Nutch. Specifically, it would 
> remove duplicated timing code and logic. It would also open 
> the door to introducing timing _*splits*_ for anyone inclined to dig 
> deeper into runtime timings.
> A cursory search for *_"long start = System.currentTimeMillis();"_* returns 
> hits in 32 files, so it's fair to say that timing already touches many 
> parts of the Nutch execution workflow.
>  
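
For illustration, the kind of change described above might look roughly like the sketch below. This is not the code from the commit: the class name StopWatchSketch and its methods are hypothetical, and it assumes commons-lang3 3.5 or later is on the classpath. It contrasts the manual System.currentTimeMillis() pattern with StopWatch, and shows how a split could capture an intermediate timing.

```java
import java.util.concurrent.TimeUnit;
import org.apache.commons.lang3.time.StopWatch;

// Hypothetical sketch; names are illustrative and not taken from Nutch.
public class StopWatchSketch {

  /** Manual pattern: the caller tracks start and end timestamps itself. */
  static long timeManually(Runnable job) {
    long start = System.currentTimeMillis();
    job.run();
    long end = System.currentTimeMillis();
    return end - start;
  }

  /** Same measurement with StopWatch: no start/end bookkeeping needed. */
  static long timeWithStopWatch(Runnable job) {
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    job.run();
    stopWatch.stop();
    return stopWatch.getTime(TimeUnit.MILLISECONDS);
  }

  /**
   * Two-phase job with a split after the first phase.
   * Returns {millis elapsed at the split, total millis}.
   */
  static long[] timeWithSplit(Runnable firstPhase, Runnable secondPhase) {
    StopWatch stopWatch = new StopWatch();
    stopWatch.start();
    firstPhase.run();
    stopWatch.split();                        // record an intermediate point
    long firstPhaseMs = stopWatch.getSplitTime();
    stopWatch.unsplit();                      // clear the split; the watch keeps running
    secondPhase.run();
    stopWatch.stop();
    return new long[] { firstPhaseMs, stopWatch.getTime() };
  }

  public static void main(String[] args) {
    Runnable work = () -> {
      try {
        Thread.sleep(50);                     // stand-in for real work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    System.out.println("manual:    " + timeManually(work) + " ms");
    System.out.println("stopwatch: " + timeWithStopWatch(work) + " ms");
    long[] t = timeWithSplit(work, work);
    System.out.println("split: " + t[0] + " ms, total: " + t[1] + " ms");
  }
}
```

The split variant is only one way to expose per-phase timings; whether any Nutch job would benefit from it is left open here.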



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
