+1

On 14 June 2018 at 03:09, Furkan KAMACI <[email protected]> wrote:
> +1
>
> On Wed, 13 Jun 2018 at 21:04, Joe Obernberger <[email protected]> wrote:
>
>> Woot!
>>
>> On 6/11/2018 11:55 AM, Chris Mattmann wrote:
>> > ++1!
>> >
>> > Sounds great.
>> >
>> > Cheers,
>> > Chris
>> >
>> > From: Sebastian Nagel <[email protected]>
>> > Reply-To: "[email protected]" <[email protected]>
>> > Date: Monday, June 11, 2018 at 7:35 AM
>> > To: "[email protected]" <[email protected]>
>> > Cc: "[email protected]" <[email protected]>
>> > Subject: Preparing to release Nutch 1.15 ?
>> >
>> > Hi all,
>> >
>> > almost 80 fixes and improvements are done now and include:
>> >
>> > NUTCH-2375: upgrade to new mapreduce API
>> > It was a huge change affecting more than 10,000 lines of code. Thanks, Omkar!
>> > Well, there have been some regressions but those are resolved now. Tests in
>> > pseudo-distributed mode [1] succeeded and also a mid-size test crawl (180
>> > million pages) on a Hadoop cluster.
>> > Would be great if anybody is able to test the Nutch master in combination with
>> > a non-HDFS file system (e.g. s3://)! Please let us know whether this works. Thanks!
>> >
>> > NUTCH-1480: Multiple index writer instances with different configurations
>> > Thanks to Roannel it's now possible to index into multiple Solr or Elasticsearch
>> > instances. With NUTCH- (needs to be reviewed) the routing of documents
>> > to the index will also be configurable.
>> >
>> > NUTCH-2583: Ralf contributed a huge upgrade of dependencies.
>> > Nutch now runs and compiles on Java 9 + 10. Only errors in unit tests need
>> > to be addressed in NUTCH-2596.
>> >
>> > And two important issues are almost ready to be committed:
>> >
>> > NUTCH-2549: a long list of fixes and improvements to protocol-http. Thanks to
>> > Gerard Bouchard!
>> >
>> > NUTCH-2576: plugin protocol-okhttp, a new HTTP protocol implementation based
>> > on the okhttp library. Supports HTTP/2.
>> >
>> > The full list of fixes and improvements is available at [2].
>> >
>> > I plan to work through the remaining 70 open issues over the next few
>> > days and hope to commit/resolve 15-25 of them and move the remaining
>> > ones to Nutch 1.16.
>> >
>> > Please vote for issues you want to get included. If there are open
>> > pull requests, it will help if these can be merged, the unit tests
>> > pass, and any review comments are addressed. Thanks!
>> >
>> > If there are any objections or blockers, please also let us know!
>> >
>> > I also plan to run a test crawl on Hadoop in the middle of this week.
>> > But any help in testing is welcome.
>> >
>> > Note that the tutorial needs to be updated (will be done after 1.15
>> > is finally released) to reflect the changes related to NUTCH-1480.
>> >
>> > Thanks,
>> > Sebastian
>> >
>> > [1] https://github.com/sebastian-nagel/nutch-test-single-node-cluster
>> > [2] https://issues.apache.org/jira/projects/NUTCH/versions/12342302

