Re: [Glamtools] GW Toolset problems

2016-02-05 Thread bawolff
On Fri, Feb 5, 2016 at 5:10 AM, Martin Poulter wrote:
> I can supply a control condition. The Bodleian Library URLs, which work
> without a problem, have a similar form:
> http://iiif.bodleian.ox.ac.uk/iiif/image/e998e3d6-17e2-40ca-bf23-bc4278feb198/full/!1000,1000/0/default.jpg
> Note that there is a comma but no dot before the filename. This suggests that 
> commas are not the problem.
>

This file sends the correct image/jpeg mime type:

bawolff@tools-bastion-01:~$ HEAD 'http://iiif.bodleian.ox.ac.uk/iiif/image/e998e3d6-17e2-40ca-bf23-bc4278feb198/full/!1000,1000/0/default.jpg'
200 OK
Cache-Control: max-age=86400
Connection: close
Date: Tue, 02 Feb 2016 16:02:04 GMT
Via: 1.1 varnish-v4
Age: 0
Server: iipsrv/1.0
Content-Length: 97569
Content-Type: image/jpeg
Last-Modified: Mon, 01 Sep 2014 18:01:07 GMT
Access-Control-Allow-Origin: *
Client-Date: Fri, 05 Feb 2016 11:31:00 GMT
Client-Peer: 163.1.141.208:80
Client-Response-Num: 1
Content-Disposition: inline;filename="full_lossless.jpg"
X-Cache: HIT
X-Cache-Hits: 4
X-Varnish: 12281155 11378260


Note the Content-Type: image/jpeg line.
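
For anyone who wants to repeat this check without the HEAD command-line tool, here is a minimal Python sketch. It is only an illustration: the function names are hypothetical, and it assumes the server answers HEAD requests. It fetches the Content-Type header with the standard library and compares it with image/jpeg.

```python
from urllib.request import Request, urlopen


def normalize_media_type(header_value: str) -> str:
    """Drop parameters such as '; charset=...' and lowercase the media type."""
    return header_value.split(";", 1)[0].strip().lower()


def fetch_content_type(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and return the normalized Content-Type value."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return normalize_media_type(response.headers.get("Content-Type", ""))


def is_registered_jpeg_type(header_value: str) -> bool:
    """True only for image/jpeg; the common misspelling image/jpg is rejected."""
    return normalize_media_type(header_value) == "image/jpeg"
```

Passing the result of fetch_content_type() for the Bodleian URL above to is_registered_jpeg_type() should agree with the header dump, while a server that sends image/jpg would be flagged.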

--
-bawolff

___
Glamtools mailing list
Glamtools@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/glamtools


Re: [Glamtools] GW Toolset problems

2016-02-04 Thread bawolff
On Thu, Feb 4, 2016 at 2:44 PM, Jean-Frédéric wrote:
>
> 2016-02-04 19:31 GMT+00:00 Brian Wolff:
>>
>> I think it's more likely gwtoolset looks at the mime type set in the
>> Content-type header set by the webserver.
>
> (Disclaimer: I checked out the GWToolset for the first time today).
>
> According to the documentation of getFileExtension in
> includes/Handlers/UploadHandler.php (assuming I’m looking at the right
> place), it is doing both:
>>>
>>> attempts to get the file extension of a media file url using the
>>> $options provided. it will first look for a valid file extension in the
>>> url; if none is found it will fallback to an appropriate file extension
>>> based on the content-type
>

I just checked this file. The webserver is indeed misconfigured: it
returns a mime type of image/jpg (the correct mime type is image/jpeg).

I believe the code you're referencing is used to determine the
extension to give the image when uploading it. For example, JPEG files
are allowed to have the extension jpg, jpeg or even jpe, so when the
mime type sent is image/jpeg, the code uses the URL to decide which of
those three alternatives to use. But if a different mime type is sent,
it won't consider .jpg to be a valid extension for that type.

The code comment itself is a bit misleading. If you look at the actual
code (simplified here to make it clearer):

$result = null;
...
$pathinfo['extension'] = ...; // extension taken from the URL path
if ( in_array( $pathinfo['extension'], $wgFileExtensions )
    && strpos(
        $MimeMagic->getTypesForExtension( $pathinfo['extension'] ),
        $options['content-type']
    ) !== false
) {
    // If the extension from the URL is in $wgFileExtensions (the allowed
    // upload extensions), and one of the possible mime types for that
    // extension matches what the server sent, use the extension from the URL.
    $result = $pathinfo['extension'];
} elseif ( !empty( $options['content-type'] ) ) {
    // Otherwise, just use the default extension for this mime type.
    $result = explode( ' ',
        $MimeMagic->getExtensionsForType( $options['content-type'] ) );

    if ( !empty( $result ) ) {
        $result = $result[0];
    }
}


So here is what happens in this case. The image has a mime type of
image/jpg, which is not a real mime type. First the code tries to use
the extension from the URL (.jpg), but when it checks whether .jpg is a
valid extension for that mime type, it determines that it isn't (since
there are no valid extensions for a non-existent mime type), so it
discards that option. Then it tries to use the default extension for
the given mime type, but that fails too, since a non-existent mime type
has no default extension. And thus the check fails overall.
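
The two-step decision described above can be modelled in a few lines of Python. This is a rough sketch, not the GWToolset code: the tables below are small hypothetical stand-ins for $wgFileExtensions and MediaWiki's MIME data, and pick_extension is an invented name.

```python
from typing import Optional

# Hypothetical stand-in for $wgFileExtensions (allowed upload extensions).
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}

# Hypothetical stand-in for the MIME table: known type -> extensions,
# with the first entry treated as the default extension.
EXTENSIONS_FOR_TYPE = {
    "image/jpeg": ["jpeg", "jpg", "jpe"],
    "image/png": ["png"],
    "image/gif": ["gif"],
}


def types_for_extension(extension: str) -> list:
    """Return every known mime type that lists this extension."""
    return [t for t, exts in EXTENSIONS_FOR_TYPE.items() if extension in exts]


def pick_extension(url_extension: str, content_type: str) -> Optional[str]:
    """Prefer the URL's extension if it is allowed and matches the server's
    mime type; otherwise fall back to the default extension for that type."""
    if (url_extension in ALLOWED_EXTENSIONS
            and content_type in types_for_extension(url_extension)):
        return url_extension
    if content_type:
        candidates = EXTENSIONS_FOR_TYPE.get(content_type, [])
        if candidates:
            return candidates[0]
    return None  # neither step produced a usable extension
```

With these assumed tables, a .jpg URL served as image/jpeg keeps the jpg extension, while the same URL served as image/jpg fails both steps and yields no extension at all, which is the failure described above.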

--
-bawolff



Re: [Glamtools] GW Toolset problems

2016-02-04 Thread Sébastien Santoro
Not only is the site whitelisted, but I've manually tested upload by
URL using [[Special:Upload]] and it works.

The URL path contains two elements with non-alphanumeric characters,
"2.0" and "1000,", so I'd guess that is where to look for an explanation.

Hypothesis 1: GWT considers the extension to be
".0/image/1293358/full/1000,/0/default.jpg", which is not ".jpg".

Hypothesis 2: GWT stops parsing the URL at the comma and tries to
handle http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,
which has no extension and returns a 404.

Dots in URLs are rather frequent, commas less so, so the second
hypothesis is more plausible.
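
Hypothesis 1 can be compared against a generic last-dot parser. The sketch below uses Python's standard library rather than GWToolset's PHP code, so it only shows how a typical path parser treats these URLs, not what GWT actually does; url_extension is a hypothetical helper name.

```python
from posixpath import splitext
from urllib.parse import urlsplit


def url_extension(url: str) -> str:
    """Return the extension of the last path segment, without the dot."""
    path = urlsplit(url).path
    # splitext only looks for a dot after the final "/" in the path.
    return splitext(path)[1].lstrip(".").lower()


full_url = "http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg"
truncated_url = "http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,"

print(repr(url_extension(full_url)))
print(repr(url_extension(truncated_url)))
```

With this kind of parser, neither the dot in "2.0" nor the comma affects the result for the full URL; only the truncated URL from hypothesis 2 ends up with no extension.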


On Thu, Feb 4, 2016 at 6:08 PM, Jason J. Evans wrote:
> Hello everyone, I am trying to do a batch upload using the GW Toolset, but the
> tool will not recognize the file extension from the image URLs provided.
> Here is an example of a URL from the XML file:
>
> http://dams.llgc.org.uk/iiif/test/2.0/image/1293358/full/1000,/0/default.jpg
>
> Any ideas as to why this doesn't work? (The site is whitelisted, etc.)
>
> Thanks
>
> Jason
> --
> Jason Evans
> Wicipediwr Preswyl / Wikipedian in Residence
> Llyfrgell Genedlaethol Cymru / National Library of Wales
> jason.ev...@llgc.org.uk
> Ffon/Tel: +44 (0)1970 632405
>



-- 
Sébastien Santoro aka Dereckson
http://www.dereckson.be/



Re: [Glamtools] GW Toolset problems

2016-02-04 Thread Jason J. Evans
That's very helpful, thank you very much. 
 
 
 
-- 
Jason Evans
Wicipediwr Preswyl / Wikipedian in Residence
Llyfrgell Genedlaethol Cymru / National Library of Wales
jason.ev...@llgc.org.uk
Ffon/Tel: +44 (0)1970 632405
 



Re: [Glamtools] GW Toolset problems

2016-02-04 Thread Jean-Frédéric
Thanks for the report, Jason. I filed this on Phabricator:
<https://phabricator.wikimedia.org/T125846>




-- 
Jean-Frédéric


Re: [Glamtools] GW Toolset problems

2016-02-04 Thread Jean-Frédéric
2016-02-04 19:31 GMT+00:00 Brian Wolff:

> I think it's more likely gwtoolset looks at the mime type set in the
> Content-type header set by the webserver.

(Disclaimer: I checked out the GWToolset for the first time today).

According to the documentation of *getFileExtension* in
*includes/Handlers/UploadHandler.php* (assuming I’m looking at the right
place), it is doing both:

> attempts to get the file extension of a media file url using the
> $options provided. it will first look for a valid file extension in the
> url; if none is found it will fallback to an appropriate file extension
> based on the content-type

-- 
Jean-Frédéric