Re: getting changes in (and UTF-8)

2003-05-31 Thread MJ Ray
Alexander R. Pruss <[EMAIL PROTECTED]> wrote: > 1. I myself don't need full UTF-8. I just need to make some quick-and-dirty > substitutions for quote marks, apostrophes and long dashes. So if someone Look for unknown_charref in TextParser.py -- I wonder if this can be done in a more general way

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Bill Janssen
> Yes, but it still won't give what iSiloX can give, since it's only going > to give the two files in this example. What if one wants (and I must > confess that this is not something I actually needed) to get two files, > and, say, everything these two files link to, at depth 1 relative to each?

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Alexander R. Pruss
On Thu, 29 May 2003, Laurens M. Fridael wrote: > What Alexander is saying (I think, correct me if I'm wrong) is that the > spider should be able to maintain multiple root points from which to > calculate link depth. Only one of the root points becomes the home page. > However, this goes against the
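The "multiple root points" idea described above can be sketched as a breadth-first walk seeded with several start URLs, where each page's depth is its distance to the nearest root. This is only an illustration, not Plucker's spider: `collect_pages`, the `links` dictionary (an in-memory stand-in for fetched pages), and the example file names other than index.html and cv.html are all assumptions made for the sketch.

```python
from collections import deque


def collect_pages(roots, links, max_depth):
    """Return {url: depth}, where depth is the distance to the nearest root.

    `links` maps each URL to the URLs it links to. It is a hypothetical
    stand-in for actually fetching and parsing pages.
    """
    depth = {url: 0 for url in roots}   # every root starts at depth 0
    queue = deque(roots)
    while queue:
        url = queue.popleft()
        if depth[url] >= max_depth:
            continue                    # do not follow links past the limit
        for target in links.get(url, ()):
            if target not in depth:     # first visit is the shortest path
                depth[target] = depth[url] + 1
                queue.append(target)
    return depth


# Hypothetical link graph: two roots, each followed one level deep.
links = {
    "index.html": ["cv.html", "a.html"],
    "cv.html": ["b.html"],
    "a.html": ["c.html"],
}
pages = collect_pages(["index.html", "cv.html"], links, max_depth=1)
```

With both files as roots and a depth of 1, the walk picks up a.html and b.html but not c.html, which is two links away from either root. Which root becomes the home page is a separate decision that this sketch leaves out.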

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Laurens M. Fridael
David A. Desrosiers wrote: > Not at all. This is why the home.html construct works (and it's the > foundation that Plucker was originally based upon). You have a > home.html, which resembles the following (excuse the horrible ascii > art): > >[home.html] >/|\

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread David A. Desrosiers
> However, this goes against the Plucker format, which dictates a > hierarchical, rather than a linear, structure. I.e. there is just > one "home" record and through that you branch out to the other pages. Not at all. This is why the home.html construct works (and it's the foundation that P

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Laurens M. Fridael
Michael Nordstrom wrote: > On Thu, May 29, 2003, Alexander R. Pruss wrote: >> So, perhaps the solution is to add an option to the distiller to >> start the spidering from one file (e.g., the custom HTML file >> pointing to the links one wants) and to include another file as the >> home page in the

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread David A. Desrosiers
> I don't think similar functionality exists in Plucker. We are not iSilo. We don't try to be like them. d.

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Alan Hoyle
On Thu, 29 May 2003, Michael Nordstrom wrote: > On Thu, May 29, 2003, Alexander R. Pruss wrote: > > I suppose this can be done by just making a custom HTML file pointing to > > these two files, and then setting depth to 2. The problem then is that > > the custom HTML file will, I assume, show up

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Michael Nordstrom
On Thu, May 29, 2003, Alexander R. Pruss wrote: > I suppose this can be done by just making a custom HTML file pointing to > these two files, and then setting depth to 2. The problem then is that > the custom HTML file will, I assume, show up as the home page when one > loads in the file. Well,

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Alexander R. Pruss
On Thu, 29 May 2003, Michael Nordstrom wrote: > On Thu, May 29, 2003, Alexander R. Pruss wrote: > > Thus, suppose I want to get only two URLs from my > > home page: index.html and cv.html . Now, index.html links to cv.html and > > to many other things. Ideally, I could have an include list of two

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-30 Thread Michael Nordstrom
On Thu, May 29, 2003, Alexander R. Pruss wrote: > Thus, suppose I want to get only two URLs from my > home page: index.html and cv.html . Now, index.html links to cv.html and > to many other things. Ideally, I could have an include list of two URLs, > index.html and cv.html, and then set the spid

Re: Exclude/include links (was: getting changes in (and UTF-8))

2003-05-29 Thread Alexander R. Pruss
On Thu, 29 May 2003, Michael Nordstrom wrote: > Plucker's exclusion list format allows you to both exclude and > include links, > > http://docs.plkr.org/node55.html It doesn't quite do what I was after. In iSiloX I can specify a multiple list of URLs that the fetching starts from, while Pluc

Exclude/include links (was: getting changes in (and UTF-8))

2003-05-29 Thread Michael Nordstrom
On Thu, May 29, 2003, Alexander R. Pruss wrote: > iSiloX lets you have not just an exclusion list but an inclusion > list of URLs to be fetched. Plucker's exclusion list format allows you to both exclude and include links, http://docs.plkr.org/node55.html Plucker-desktop can also handle t

Re: getting changes in (and UTF-8)

2003-05-29 Thread Alexander R. Pruss
1. I myself don't need full UTF-8. I just need to make some quick-and-dirty substitutions for quote marks, apostrophes and long dashes. So if someone can tell me where the input text is available so I can call string.replace(), I would be most grateful. 2. As for the correct order of records, I
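The quick-and-dirty substitution asked about in point 1 might look roughly like the following. It is a sketch only: `REPLACEMENTS` and `asciify_punctuation` are hypothetical names, not part of Plucker, and it assumes the input has already been decoded to a Unicode string. Where in the parser such a call would belong is not shown here.

```python
# Hypothetical helper, not part of Plucker's parser.
# Maps typographic punctuation to plain ASCII stand-ins.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
}


def asciify_punctuation(text):
    """Replace each typographic character in `text` with its ASCII stand-in."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text


sample = asciify_punctuation("\u201cHi\u201d \u2014 it\u2019s fine")
```

A lookup table keeps the list of substitutions in one place, so further characters can be added without touching the loop.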

Re: getting changes in (and UTF-8)

2003-05-29 Thread Bill Janssen
I've been somewhat derelict in my duty the last few months, since things have heated up at my work. But I'm due to finish a paper and a talk by the 10th of June, and plan to get back to Plucker work again. If you can wait that long, please post the fix on the bug tracker, and I'll work through th