Hi Peter,

The most obvious issue would be if www.olympics.sk/index.php was not 
(correctly) harvested.

Probably best to look into the WARC files and extract the response record for 
both URLs.  Probably worth checking how their CDX lines look as well.
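
As a rough sketch of the CDX check, assuming your CDX files start with the usual header line (for example " CDX N b a m s k r M S V g") that names the columns: the helper below reads that header, finds the original-URL ('a'), timestamp ('b') and response-code ('s') columns, and lists the status recorded for each matching capture. The function name and the file name in the usage comment are made up for illustration.

```python
def cdx_statuses(cdx_lines, url_fragment):
    """Yield (timestamp, original_url, status) for CDX lines whose
    original URL contains url_fragment.

    Assumes the first line is a CDX header such as
    " CDX N b a m s k r M S V g", where the first character is the
    field delimiter and the letters name the columns.
    """
    lines = iter(cdx_lines)
    header = next(lines).rstrip("\r\n")
    delim = header[0]
    fields = header[1:].split(delim)[1:]  # drop the "CDX" token
    url_col = fields.index("a")   # original URL
    time_col = fields.index("b")  # capture timestamp
    code_col = fields.index("s")  # HTTP response code
    for line in lines:
        parts = line.rstrip("\r\n").split(delim)
        if len(parts) < len(fields):
            continue  # skip malformed or truncated lines
        if url_fragment in parts[url_col]:
            yield parts[time_col], parts[url_col], parts[code_col]


# Hypothetical usage; "index.cdx" is a placeholder file name:
# with open("index.cdx", encoding="utf-8") as f:
#     for timestamp, url, status in cdx_statuses(f, "olympics.sk"):
#         print(timestamp, url, status)
```

If both URLs only ever show 3xx codes, that would fit the redirect loop you are seeing in OW.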

Assuming you're using gzipped WARCs, you can use the shell utility 'zcat' to 
look into the WARC. I find it best to pipe the output into 'less' and then use 
less's search capabilities.
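
If paging through the raw file gets tedious, the same check can be scripted. Below is a minimal sketch in Python using only the standard library, assuming the WARC is gzipped and that each record carries a Content-Length header. The function name, the file name and the URL fragment in the usage comment are placeholders, not anything OW or Heritrix provides. It reports the HTTP status line and the Location header (if any) of every response record whose target URI contains the given text.

```python
import gzip


def response_summaries(warc_path, url_fragment):
    """Yield (target_uri, http_status_line, location) for each WARC
    'response' record whose WARC-Target-URI contains url_fragment."""
    with gzip.open(warc_path, "rb") as f:
        while True:
            line = f.readline()
            if not line:
                break  # end of file
            if not line.startswith(b"WARC/"):
                continue  # blank separator lines between records
            # Read the WARC record headers up to the empty line.
            headers = {}
            for raw in iter(f.readline, b""):
                if raw in (b"\r\n", b"\n"):
                    break
                name, _, value = raw.partition(b":")
                headers[name.strip().lower()] = value.strip()
            # Read the record block so the next loop starts at the next record.
            block = f.read(int(headers.get(b"content-length", b"0")))
            uri = headers.get(b"warc-target-uri", b"").decode("utf-8", "replace")
            if headers.get(b"warc-type") != b"response" or url_fragment not in uri:
                continue
            # The block starts with the HTTP status line and headers.
            http_head = block.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1").split("\r\n")
            location = next(
                (h.split(":", 1)[1].strip()
                 for h in http_head[1:]
                 if h.lower().startswith("location:")),
                None,
            )
            yield uri, http_head[0], location


# Hypothetical usage; "harvest.warc.gz" is a placeholder file name:
# for uri, status, location in response_summaries("harvest.warc.gz", "olympics.sk"):
#     print(uri, "|", status, "|", location)
```

Seeing where each of the two URLs points in its Location header should show whether the loop was already present in the harvested responses.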

Best,
Kris

-------------------------------------------------------------------------
Landsbókasafn Íslands - Háskólabókasafn | Arngrímsgötu 3 - 107 Reykjavík
Sími/Tel: +354 5255600 | www.landsbokasafn.is
-------------------------------------------------------------------------
fyrirvari/disclaimer - http://fyrirvari.landsbokasafn.is
> -----Original Message-----
> From: [email protected] [mailto:openwayback-
> [email protected]] On Behalf Of [email protected]
> Sent: 31 August 2016 09:14
> To: openwayback-dev
> Subject: [openwayback-dev] OW redirects from example.com to
> example.com/index.php and back again
>
>
>       Hi,
>
>           I would like to ask you about a problem with OW redirections. We
> tried to harvest (with Heritrix) the page www.olympics.sk, and in OW the
> page keeps redirecting from www.olympics.sk to www.olympics.sk/index.php.
> No content is shown, only a blank space. The harvest reached the configured
> limit of 2 GB of harvested data for this domain.
>
>       Thank you for any advice, best regards,
>
>       Peter,
>       Slovak webarchive,
>       www.webdepozit.sk
>
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "openwayback-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
