Re: RO full save and images in CSS

2011-02-01 Thread Rob Kendrick
On Sat, Jan 29, 2011 at 03:42:54AM +, Harriet Bazley wrote:
> I don't think that's actually possible:   WebsterXL used to try to do
> just this, and the result was that you got all sorts of directories with
> random image files in appearing *above* the 'full save' you thought
> you'd created.   Unfortunately not all elements of a web page are
> necessarily loaded from levels subsidiary to the actual HTML file.

It's worse than that; any given URL might return different data on each
request.  (Think about what happens if a "counter image" CGI is included
twice.)

B.



Re: RO full save and images in CSS

2011-02-01 Thread Rob Kendrick
On Sun, Jan 30, 2011 at 12:37:38PM +, Martin Bazley wrote:
> The following bytes were arranged on 29 Jan 2011 by Harriet Bazley :
> 
> > Unfortunately not all elements of a web page are necessarily loaded
> > from levels subsidiary to the actual HTML file.
> 
> Well, yeah, duh.
> 
> The structure I'd find useful would be something like this:
> 
> !Appname
> !Appname.www/example/org/uk
> !Appname.www/example/org/uk.index/html
> !Appname.www/example/org/uk.images.pic1a/gif
> !Appname.www/example/org/uk.thumbs.pic1a/gif
> !Appname.www/another/site/com.public.pics.big.screen/jpg
> !Appname.!Run (Filer_Run .www/example/org/uk.index/html)

The reason it is not like this is to maintain compatibility with file
systems that only support 10-character filenames.  We already decided at the hack
weekend that we couldn't be bothered to support these systems any more
(after all, we don't support anything older than RISC OS 4.02 anyway.)

The idea we actually had was to delegate filenaming for this to the
front ends, so they can deal with their local OS's own bizarre file
system conventions.  Under UNIX, and possibly RISC OS, this would
probably contain enough information for a human to identify the file's
source, as well as a unique ID.
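
A minimal sketch of what such a front-end naming hook might produce,
assuming a readable stem taken from the URL's leafname plus a short hash
of the full URL as the unique ID.  The function name local_name, the
24-character stem limit and the use of SHA-1 are all illustrative
assumptions, not anything NetSurf is known to do:

```python
import hashlib
import posixpath
import re
from urllib.parse import urlsplit


def local_name(url, max_stem=24):
    """Return a hypothetical local filename: readable stem + unique ID."""
    leaf = posixpath.basename(urlsplit(url).path) or "index"
    # Keep only conservative characters so the stem is safe on most
    # file systems; everything else becomes an underscore.
    stem = re.sub(r"[^A-Za-z0-9_-]", "_", leaf)[:max_stem]
    # The ID is derived from the whole URL, so two objects with the same
    # leafname in different directories still get different names.
    uid = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
    return f"{stem}-{uid}"
```

Each front end could swap in its own character rules and length limits
while keeping the same stem-plus-ID shape.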

> That way, filename conversion would simply be a matter of replacing
> "http://" with "file:", and relative URLs wouldn't need
> changing at all.

It requires more thought than this, especially given it has to be
portable.

> All the same, I get the distinct impression that the only way that would
> ever happen would be if I implemented it myself.  

This is likely.

> (Which I would, if
> NetSurf wasn't written in this mysterious indecipherable language called
> 'C'...)

It clearly isn't indecipherable, given millions of people are able to
read and write it quite happily :) 

If it were written in BASIC, you'd have something approximately as
useful as WebsterXL, i.e. not at all.

B.



Re: RO full save and images in CSS

2011-01-30 Thread Martin Bazley
The following bytes were arranged on 29 Jan 2011 by Harriet Bazley :

> Unfortunately not all elements of a web page are necessarily loaded
> from levels subsidiary to the actual HTML file.

Well, yeah, duh.

The structure I'd find useful would be something like this:

!Appname
!Appname.www/example/org/uk
!Appname.www/example/org/uk.index/html
!Appname.www/example/org/uk.images.pic1a/gif
!Appname.www/example/org/uk.thumbs.pic1a/gif
!Appname.www/another/site/com.public.pics.big.screen/jpg
!Appname.!Run (Filer_Run .www/example/org/uk.index/html)

That way, filename conversion would simply be a matter of replacing
"http://" with "file:", and relative URLs wouldn't need
changing at all.  This would have the advantage that relative URLs in
places not currently supported (e.g. CSS) could be much more easily
manually fixed, by simply downloading the pictures and placing them in
the correct directory.
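
As a rough sketch, the mapping from a URL to the layout listed above
could look something like the following.  It assumes the usual RISC OS
convention of '.' as the directory separator with '/' standing in for
dots inside a name; the function name url_to_riscos_path and the
"index.html" default are hypothetical, and query strings and characters
that are illegal in RISC OS filenames are ignored entirely:

```python
from urllib.parse import urlsplit


def url_to_riscos_path(url, app="!Appname"):
    """Map a URL onto an application-relative RISC OS style path."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Split the URL path into its non-empty segments.
    segments = [s for s in parts.path.split("/") if s]
    # A bare host or a directory URL is assumed to mean the index page.
    if not segments or parts.path.endswith("/"):
        segments.append("index.html")

    def swap(name):
        # Dots inside a name become '/', as in "pic1a/gif".
        return name.replace(".", "/")

    # '.' joins the application, the host and each path segment.
    return ".".join([app, swap(host)] + [swap(s) for s in segments])
```

With that mapping, http://www.example.org.uk/images/pic1a.gif lands at
!Appname.www/example/org/uk.images.pic1a/gif, matching the listing.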

I don't find NetSurf's current structure very useful at all - not just
because it makes it impossible to find anything, but because it makes it
almost impossible to correct it when things go wrong, such as the
problem mentioned in this thread!

Another benefit of the above structure would be that you could full-save
two different web pages on top of the same application.  (To this end,
an option to automatically convert URLs to the local structure,
regardless of whether it meets the criteria for downloading or not,
would be helpful.)

All the same, I get the distinct impression that the only way that would
ever happen would be if I implemented it myself.  (Which I would, if
NetSurf wasn't written in this mysterious indecipherable language called
'C'...)

-- 
  __<^>__   Follow me on Twitter! --> http://twitter.com/swirlythingy
 / _   _ \  (Or, um, don't.  It's a free country and all that.)
( ( |_| ) )
 \_>   <_/  === Martin Bazley ==




Re: RO full save and images in CSS

2011-01-30 Thread Richard Porter
On 29 Jan 2011 Harriet Bazley  wrote:

> On 25 Jan 2011 as I do recall,
>   Martin Bazley  wrote:

>> (The full save code really needs rewriting anyway, to organise things in
>> a directory structure with original leafnames intact mimicking the
>> structure of an actual website, simultaneously making it more
>> user-friendly to browse and easier to transcode URLs for
>>
> [snip]

> I don't think that's actually possible:   WebsterXL used to try to do
> just this, and the result was that you got all sorts of directories with
> random image files in appearing *above* the 'full save' you thought
> you'd created.   Unfortunately not all elements of a web page are
> necessarily loaded from levels subsidiary to the actual HTML file.

Not only that, but you might have objects with the same leafnames in 
different directories so you would still need to ensure unique 
filenames. It might help to add the usual filename extensions even 
though they're not needed.
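
A minimal sketch of keeping leafnames unique when everything is saved
into one flat directory, assuming a simple numeric suffix is acceptable.
The function name unique_leafname and the "_1", "_2" suffix scheme are
illustrative only:

```python
import posixpath
from urllib.parse import urlsplit


def unique_leafname(url, used):
    """Return a leafname for url that is not already in the set `used`."""
    leaf = posixpath.basename(urlsplit(url).path) or "index.html"
    stem, ext = posixpath.splitext(leaf)
    candidate, n = leaf, 0
    # Append a counter before the extension until the name is free.
    while candidate in used:
        n += 1
        candidate = f"{stem}_{n}{ext}"
    used.add(candidate)
    return candidate
```

The extension is kept at the end so the saved files remain recognisable
when copied to other systems.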

> I find Netsurf's approach - to rewrite the whole thing into a RISC OS
> application structure and enclose an Inventory file listing the original
> sources/names of the files - to be much more useful in practice, and
> more elegant.

Agreed.

-- 
Richard Porter    http://www.minijem.plus.com/
  mailto:r...@minijem.plus.com
I don't want a "user experience" - I just want stuff that works.



Re: RO full save and images in CSS

2011-01-30 Thread Harriet Bazley
On 25 Jan 2011 as I do recall,
  Martin Bazley  wrote:

> (The full save code really needs rewriting anyway, to organise things in
> a directory structure with original leafnames intact mimicking the
> structure of an actual website, simultaneously making it more
> user-friendly to browse and easier to transcode URLs for
>
[snip]

I don't think that's actually possible:   WebsterXL used to try to do
just this, and the result was that you got all sorts of directories with
random image files in appearing *above* the 'full save' you thought
you'd created.   Unfortunately not all elements of a web page are
necessarily loaded from levels subsidiary to the actual HTML file.

I find Netsurf's approach - to rewrite the whole thing into a RISC OS
application structure and enclose an Inventory file listing the original
sources/names of the files - to be much more useful in practice, and
more elegant.
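
To illustrate the idea of an inventory that records where each saved
file came from, here is a small sketch.  The one-line-per-file layout
("leafname, space, source URL") is an assumption made for the example
and is not claimed to be the exact format NetSurf writes; both function
names are hypothetical:

```python
def format_inventory(entries):
    """Build inventory text from (local_leafname, source_url) pairs."""
    return "".join(f"{leaf} {url}\n" for leaf, url in entries)


def parse_inventory(text):
    """Read the text back into a {local_leafname: source_url} mapping."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumes leafnames contain no spaces; the URL is the remainder.
        leaf, url = line.split(" ", 1)
        result[leaf] = url
    return result
```

With such a file alongside the saved page, a missing or wrong object can
be traced back to its original URL and fetched again by hand.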

-- 
Harriet Bazley ==  Loyaulte me lie ==

The nice thing about standards is that there are so many to choose from




Re: RO full save and images in CSS

2011-01-27 Thread Martin Bazley
The following bytes were arranged on 27 Jan 2011 by Erving :

> From:  Martin Bazley 
> Date:  25 Jan 2011
>
> > For a good example, try full-saving http://www.beano.com/ .
>
> I just repeatedly get 'Connection time-out' (Risc PC, RISC OS 4.02,
> NetSurf 2.6) though Firefox on an ancient Windows laptop displays the
> page in seconds.

Yes, it does that sometimes; you just have to keep refreshing until it
works.

-- 
  __<^>__   Follow me on Twitter! --> http://twitter.com/swirlythingy
 / _   _ \  (Or, um, don't.  It's a free country and all that.)
( ( |_| ) )
 \_>   <_/  === Martin Bazley ==




Re: RO full save and images in CSS

2011-01-26 Thread Erving
From:  Martin Bazley 
Date:  25 Jan 2011

snip
> 
> For a good example, try full-saving http://www.beano.com/ .
> 
snip


I just repeatedly get 'Connection time-out' (Risc PC, RISC OS 4.02,
NetSurf 2.6) though Firefox on an ancient Windows laptop displays the
page in seconds.

--

Erving
 



Re: RO full save and images in CSS

2011-01-25 Thread Richard Porter
On 25 Jan 2011 Martin Bazley  wrote:

> I see from the progress page that this is a known issue, but I just want
> to request its implementation.

> The RISC OS full save, while saving all images referred to in HTML,
> doesn't pay any attention to images referred to in CSS, enclosed in
> url() brackets.  They are neither downloaded nor renamed.

I thought that a full save did in fact save all the objects that were 
downloaded (i.e. before NS ran out of memory, if it did) but that 
those referenced in stylesheets weren't accessible by clicking Menu on 
them.
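
For reference, picking the url() references out of a stylesheet and
resolving them against the stylesheet's own address could be sketched
roughly as below.  This uses a plain regular expression rather than a
real CSS parser, so it is only an approximation (it would also match
url() text inside comments), and the function name css_image_urls is
hypothetical:

```python
import re
from urllib.parse import urljoin

# Matches url(foo), url('foo') and url("foo"), with optional whitespace.
CSS_URL = re.compile(r"url\(\s*(['\"]?)(.*?)\1\s*\)", re.IGNORECASE)


def css_image_urls(css_text, stylesheet_url):
    """Return absolute URLs for every url() reference in css_text."""
    found = []
    for match in CSS_URL.finditer(css_text):
        ref = match.group(2).strip()
        # Skip empty references and inline data, which need no download.
        if not ref or ref.lower().startswith("data:"):
            continue
        # Relative references resolve against the stylesheet, not the page.
        found.append(urljoin(stylesheet_url, ref))
    return found
```

A saver would need both to fetch each URL returned here and to rewrite
the matching url() text to point at the local copy.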

Richard
-- 
Richard Porter    http://www.minijem.plus.com/
  mailto:r...@minijem.plus.com
I don't want a "user experience" - I just want stuff that works.



Re: RO full save and images in CSS

2011-01-25 Thread Rob Kendrick
On Tue, Jan 25, 2011 at 03:00:02PM +, Martin Bazley wrote:
> 
> (The full save code really needs rewriting anyway, to organise things in
> a directory structure with original leafnames intact mimicking the
> structure of an actual website, simultaneously making it more
> user-friendly to browse and easier to transcode URLs for, but that won't
> happen in the near future and this is a more important stop-gap.)

The need to rewrite the full save functionality was discussed at our
last hack weekend.  Apart from the issue you describe, the other obvious
one is that it is very RISC OS-specific.

The only blocker on doing this is nobody wanting to spend the time to do it.

B.