Re: OAI::Harvester installation help
> I am attempting to use some code which depends on Net::OAI::Harvester, but my attempts to install OAI::Harvester are running into problems with:
>
> Any suggestions for getting this installed properly? I'm assuming that this is a case where a simple force install isn't going to get me a working installation...

I've used Net::OAI::Harvester on both Ubuntu and Windows XP for my projects. On XP I've used Strawberry Perl, in which installing using CPAN luckily worked without any problem.

In Ubuntu, I had to install from synaptic when CPAN failed. If I remember correctly, the command was:

# sudo apt-get install libnet-oai-harvester-perl

It works better than cpan in managing dependencies in my experience.

Regards,
Saiful Amin
RE: OAI::Harvester installation help
Thanks for the responses! I was also contacted off-list by Net::OAI::Harvester's maintainer and we tracked the issue down to a rogue standalone release of XML::SAX::Base. Removing that, so that the XML::SAX::Base included with XML::SAX was used instead, resolved my problems and I was then able to install Net::OAI::Harvester cleanly.

> I am not so familiar with the oai harvesting tools in Perl, so forgive me if I am giving you incorrect information. My vague recollection is that there are several oai harvesters for Perl.

There are, but the import tool I'm using is specifically built to use Net::OAI::Harvester.

> In Ubuntu, I had to install from synaptic when CPAN failed. If I remember correctly, the command was:
>
> # sudo apt-get install libnet-oai-harvester-perl
>
> It works better than cpan in managing dependencies in my experience.

Although my dev laptop is Ubuntu, the servers I deploy to tend to be running Very Old Releases of Fedora, so I try to stay within CPAN despite its warts, simply for the sake of broad compatibility.

The security of running installation tests is a great bonus, too, though. In this particular case, I expect that installing Ubuntu's packaged version would have gotten me a non-working installation, because the rogue XML::SAX::Base would still be there; it just wouldn't have been caught during installation.
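The diagnosis above (a standalone XML::SAX::Base shadowing the copy shipped with XML::SAX) can be checked before reinstalling anything. The following is only a minimal sketch: the helper name module_info is hypothetical, and the method name get_handler is taken from the maintainer's explanation in this thread. It reports which file Perl actually loads for a module, the version that file declares, and whether a given method is available.

```perl
use strict;
use warnings;

# Hypothetical helper: load a module by name and report where it came from.
# Returns undef if the module cannot be loaded at all.
sub module_info {
    my ($module, $method) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;     # XML::SAX::Base -> XML/SAX/Base.pm
    return unless eval { require $file; 1 };
    return {
        file       => $INC{$file},              # path of the copy that was loaded
        version    => $module->VERSION,         # may be undef if none is declared
        has_method => defined $method
                      ? ($module->can($method) ? 1 : 0)
                      : undef,
    };
}

my $info = module_info('XML::SAX::Base', 'get_handler');
if ($info) {
    printf "loaded from %s, version %s, get_handler %s\n",
        $info->{file},
        $info->{version} // 'unknown',
        $info->{has_method} ? 'available' : 'missing';
}
else {
    print "XML::SAX::Base could not be loaded\n";
}
```

If the reported path does not sit next to the installed XML/SAX.pm, or get_handler is reported missing, the symptoms match the situation described in this thread.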
Re: OAI::Harvester installation help
Hello,

thanks to further input from the original reporter I (as co-maintainer of N:O:H) have been able to sort out the issues illustrated by the report:

- one of the repositories used in the test suite changed its address recently thus causing some tests to fail: fixed
- The LibXML family of parsers behaves very noisily when it comes to the test for illegal XML making it hard to notice that the test actually succeeds (not fixed)

Two issues with XML::SAX might be of broader interest:

[not applicable to this thread:

- ParserDetails.ini

XML::SAX (::ParserFactory) uses a text file ParserDetails.ini located in the folder SAX.pm resides in or (Debian only?) under /etc/perl/... This file contains the list of known parsers in this installation and their properties. For myself I have noticed several times that this file was not generated (because of non-interactive dependency installs?) and subsequently installed individual parsers were not registered. My impression is that the XML::SAX framework should fall back to XML::SAX::PurePerl installed by the package itself but this does not seem to happen in the Net::OAI::Harvester test suite (maybe because the parser is requested with a required feature).

Status: not tackled yet
cf. http://perl-xml.sourceforge.net/faq/#parserdetails.ini ]

- latest version of XML::SAX::Base

some sub-modules of Net::OAI::Harvester use the get_handler() method supplied by XML::SAX::Base as of version 1.04. This module is literally hidden in the XML::SAX distribution (source is generated by Makefile.PL to prevent indexing of the module on CPAN). There is a strain of standalone versions of XML::SAX::Base on CPAN ending at version 1.02 which does not contain the method in question. (The README of this standalone module gives the warning that you probably do not want to install this module but the complete XML::SAX framework).
When you explicitly request installation of XML::SAX::Base there is a chance that this fetches version 1.02, which then takes precedence over or actually overwrites version 1.04 installed by XML::SAX, and there is absolutely no upgrade path: XML::SAX::Base 1.02 must then be uninstalled/removed for things to work again.

For Net::OAI::Harvester I have refined the requirement of XML::SAX::Base to the specific version 1.04 and I'm awaiting the CPAN Tester reports to come in: it might well be that more systems than before are tricked into installing the wrong module when performing Build installdeps and thus effectively cut themselves off from executing the tests at all.

Thomas Berger

On 18.05.2011 09:14, Saiful Amin wrote:

> > I am attempting to use some code which depends on Net::OAI::Harvester, but my attempts to install OAI::Harvester are running into problems with:
> >
> > Any suggestions for getting this installed properly? I'm assuming that this is a case where a simple force install isn't going to get me a working installation...
>
> I've used Net::OAI::Harvester on both Ubuntu and Windows XP for my projects. On XP I've used Strawberry Perl, in which installing using CPAN luckily worked without any problem.
>
> In Ubuntu, I had to install from synaptic when CPAN failed. If I remember correctly, the command was:
>
> # sudo apt-get install libnet-oai-harvester-perl
>
> It works better than cpan in managing dependencies in my experience.
>
> Regards,
> Saiful Amin
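Since the standalone module may either shadow or overwrite the copy shipped with XML::SAX, it can help to list every copy of XML/SAX/Base.pm that is reachable through @INC. This is a rough sketch under assumptions: the helper name find_copies is hypothetical, and it only detects the "takes precedence" case; if the file was overwritten in place there will be a single copy and only its version number tells the story.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every existing copy of a module file under the
# given directories, in search order. With @INC as the directory list, the
# first entry is the copy that "require" would pick up.
sub find_copies {
    my ($module, @dirs) = @_;
    my @parts = split /::/, $module;
    $parts[-1] .= '.pm';
    my @found;
    for my $dir (@dirs) {
        next if ref $dir;                       # skip code-ref hooks in @INC
        my $path = File::Spec->catfile($dir, @parts);
        push @found, $path if -f $path;
    }
    return @found;
}

my @copies = find_copies('XML::SAX::Base', @INC);
print "$_\n" for @copies;
print "more than one copy found; the first one listed shadows the rest\n"
    if @copies > 1;
```

Removing the unwanted copy by hand (or through whatever installed it) corresponds to the "must be uninstalled/removed" step described above.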
Re: OAI::Harvester installation help
Dave Sherohman writes

> Hey, all! Long-time Perl programmer, but new to the world of libraries, so I'm not all that familiar with all the data formats used in these parts.
>
> I am attempting to use some code which depends on Net::OAI::Harvester, but my attempts to install OAI::Harvester are running into problems with:

I am not so familiar with the oai harvesting tools in Perl, so forgive me if I am giving you incorrect information. My vague recollection is that there are several oai harvesters for Perl. The one I use is a different one, I think: HTTP::OAI. I suggest you try this one.

FWIW, I attach a script that I use to download OAI archives. I used to keep a collection of them, based on an opendoar listing. I think I will soon stop it.

I hope this is helpful.

Cheers,

Thomas Krichel
http://openlib.org/home/krichel
http://authorclaim.org/profile/pkr1
skype: thomaskrichel

#!/usr/bin/perl -w

use lib '/home/mamf/usr/share/perl/';

use strict;
use Data::Dumper;
use Data::Random qw(:all);
use List::Util qw(shuffle);
use File::Basename;
use File::Compare;
use File::Copy;
use File::Find;
use File::Path;
use File::Listing qw(parse_dir);
use File::Temp qw/ tempfile tempdir /;
use File::Touch;
use LWP::Simple;
use HTTP::OAI;
use Storable;
use XML::DOM;
use XML::LibXML;
use Time::Piece;
use Time::Seconds;
# use Sys::RunAlone;

## home-grown
use Mamf::Common;

## the size of the files, in terms of OAI_DC records
my $batch_size=100;

## directories
my $home=$ENV{'HOME'};
my $log_dir="$home/public_html/log";
my $amf_file="$home/amf/oa/oa.amf.xml";

## renewal time of 30 days
my $renewal_time=30*24*60*60;

## counters
my $collection_count=0;
my $no_oai_count=0;

## XML and standards constants
my $amf_ns='http://amf.openlib.org';
my $doar_ns='http://opendoar.org';
my $freelib_ns='http://3lib.org';
my $collection_prefix='info:3lib:oa:';

## run parameter
my $verbose=0;

##
## first argument will be an archive to do
##
my $to_do_archive=$ARGV[0];

##
## parse the amf file to find the already existing
## 3lib ids and the oai interfaces, recorded in doar
##
## gives the oai_url for an id
my $oai_urls;
## gives the id for an oai_url
my $ids;
## gives the rID for an oai_url
my $rIDs;
## gives the metadata_formats for an id
my $metadata_format;

##
## open log file (mode assumed: append)
##
my $date=`date -I`;
chomp $date;
my $log_file="$log_dir/down_oa_$date.log";
open(LOG,">> $log_file");
binmode(LOG,":utf8");

## populate these variables, deletes
## archives not to get
parse_oa_amf();

## create in_dirs variable, that contains input
## directories
my @in_dirs;
## an indicator of the input directory
my $in_dir;
foreach my $archive (keys %{$metadata_format}) {
  my $format=$metadata_format->{$archive};
  if(not defined($in_dir->{$format})) {
    push(@in_dirs,"$home/opt/$format/oa/$archive");
  }
  ## double meaning array
  $in_dir->{$archive}="$home/opt/$format/oa/$archive";
}

if(not $to_do_archive) {
  harvest_all();
}
else {
  print "doing $to_do_archive\n";
  eval {
    harvest_to_dir($to_do_archive);
  } ;
}
exit;

##
## shuffle the oai_url, find what archives to download
##
sub harvest_all {
  my @rand_ids=shuffle(keys %{$oai_urls}) ;
  my $ineligible=get_ineligble_archives($renewal_time);
  foreach my $id (@rand_ids) {
    open(LOG,">> $log_file");
    binmode(LOG,":utf8");
    my $date=`date --rfc-3339=seconds`;
    chomp $date;
    print LOG "at: $date ";
    if($ineligible->{$id}) {
      print LOG "not renewing ".$id.", rID ".$rIDs->{$id}.", ".$ineligible->{$id}."\n";
      next;
    }
    ## try to catch errors if it bombs out
    print LOG "get: $id, rID $rIDs->{$id} from $oai_urls->{$id}\n";
    eval {
      harvest_to_dir($id);
    } ;
    if($@) {
      print LOG "error at id $id: $@\n";
      close LOG;
    }
  }
  close LOG;
}

sub get_ineligble_archives {
  ## directory where the archives
  my $max_ago=shift;
  ## result, an array reference
  my $r;
  my $count;
  foreach my $in_dir (@in_dirs) {
    if(not -d $in_dir) {
      print LOG "making $in_dir\n";
      mkdir $in_dir;
    }
    #foreach my $dir (`ls $format_dir`) {
    #  ## remove newline
    #  chomp $dir;
    #  ## it has to have 6-char names
    #  my $archive_dir="$format_dir/$dir";
    #  if($verbose) {
    #print LOG "checking $archive_dir\n";
    #  }
    if(not $in_dir=~m|/([^/]{6})$|) {
      next;
    }
    my $id=$1;
    $r->{$id}=is_eligible($in_dir,$max_ago);
  }
  return $r;
}

## check for archiving time
sub is_eligible {
  ## list xml files, but report no error if they are
  ## not there
  ## code kept as a transition
  my $archive_dir=shift;
  my $max_ago=shift;
  my $now=time();
  if(not -d $archive_dir) {
    print "no such dir: $archive_dir\n";
  }
  ## check if it is locked
  my $lock_file="$archive_dir/lock";
  if(-f $lock_file) {
    ## remove lock file if