Re: Question regarding your Plucker product

2001-11-27 Thread David A. Desrosiers


> It was a quick "hack" I did six months ago, so I guess I should clean it
> up a bit...

It'll need a quick modification when I release the new pilot-link.
The whole libpisock backend is being rewritten substantially to implement
protocol "recipes", so we can have per-device protocol separation. More on
that later.
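
Purely as an illustration of the per-device idea (none of this is actual
pilot-link code; the lower-layer names are the classic Palm protocol
stack, everything else is invented):

# Illustration only, not pilot-link code: a "recipe" names the
# protocol layers used to talk to a given device.
RECIPES = {
    "palm-serial": ["slp", "padp", "cmp", "dlp"],
    "palm-usb":    ["usb", "padp", "cmp", "dlp"],
    "visor-usb":   ["usb-direct", "dlp"],
}

def protocol_stack(device):
    """Look up the per-device protocol recipe."""
    return RECIPES[device]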



/d





RE: Question regarding your Plucker product

2001-11-27 Thread Jon Wickstrom

Hi,

I'll just start by boosting your egos and say that Plucker is great
and it's wonderful that you have put so much time into making it
what it is today.

> Currently, when you click on a link that wasn't retrieved, you have
> the option to copy the URL to the Memo database, and you can if you
> wish add text after the URL. This feature could be left exactly as is;
> no changes to the viewer would be necessary. Instead, there could be
> an additional program that the user runs manually which would read
> the Memo database that has been hotsynced to the PC, would pull
> out all the Memo records that were saved from the Plucker viewer,
> create a temporary HTML file from them. The new program would then
> call the parser to fetch all links in that HTML file.

How about extending the parser to understand the concept of "add missing
links to document"? The missing links would then not become a new
document, but would instead appear in the original doc. This would need
some kind of functionality in the parser which, the next time the
document is fetched, would fetch a link if it exists in a "missing
links" array, regardless of maxdepth, stayonhost and other constraints.
Links which cannot be mapped to a source document could be fetched into
a "Missing links" document. The original source document would not have
to be known; we just fetch the "missing link" if we stumble upon it in a
page we fetch.

Problems, as I see them, are at least:
- How do we know if the missing link has been fetched? We might be
  doing several parser runs. Maybe some way of telling the parser
  "fetch all missing links not fetched" into a separate document?
- Right now we don't fetch at sync-time (and probably don't want to),
  so the roundtrip would be sync - fetch - sync, if I understand it
  correctly?
- Somebody might always/sometimes want the missing links in a new
  document.
- Problems when the source document changes?
- This is just an idea, not a complete solution.

> To specify maxdepths, file names and other options, a section with a
> certain name could be added to the usual Plucker config files by the
> user. The new program would refer to that section when calling the
> parser. Also (possibly as a future enhancement) the viewer could be
> modified to add a little pull-down "maxdepth" selection list to the
> "External Link" screen: before you click on the "Copy URL" button,
> you select the maxdepth for that particular link from the list, and
> it's saved in the Memo pad entry in some format that the new program
> can read and understand.

Sounds reasonable. A field for naming the link would also be great; it
could default to the body of the <a> tag. And maybe a checkbox for
selecting whether we want to copy the link as HTML (<a href="xxx">yyy</a>)
or as plain text? Right now the missing link/external link form quite
verbosely informs the user about why the document is missing, so there
would be room for new fields. Then again, if all the bells and whistles
are added, new users might be confused (there probably is a reason for
the long text in the form?).

> > Then there's the --no-urlinfo complex. If it's used, you lose the
> > ability to retrieve those "out of bounds" urls.

But that is a deliberate(?) choice made by the user who created the
document?

> > Alternately, we pull the database from the Palm, run a gather on the
> > desktop, comparing against what is in the database we just pulled from
> > the Palm, and then integrate those "missing" records. However, would
> > you just want to append those records? Or remove the ones already
> > read, and then replace them with the "out of bounds" records you
> > checked?

If we merge the missing links into the original source document, then
we keep the missing link as long as we have the document. This again
conflicts with fetching the unseen links in a "missing links" document.

I would like a nice GUI for managing the links (on the handheld and on
the desktop) ;-). Managing the memos which Plucker currently creates is not
so easy, but far better than nothing. In my structured world the missing
links should go into a separate database (I believe this was debated a while
back on this list?).

What is the concept on non-Windows platforms with backup databases? On
Windows the HotSync manager makes a copy of the database on the
handheld. This copy could be checked, under some conditions, for what
pages are included in a document. Doesn't the Plucker parser also keep
a copy of the databases it creates? That database could be checked
before it is replaced by a new one.


Yes, I know Plucker is Open Source, that nobody gets paid for
implementing new features (or maintaining it), that the source is in
CVS, and that if I want something I can (or might be able to) implement
it myself.


   -Jonte



Re: Question regarding your Plucker product

2001-11-26 Thread Michael Nordström

On Mon, Nov 26, 2001, Chris Hawks wrote:

> Mike wrote a program to do this. pluckerlinks???

I had completely forgotten about that program ;-)

It was a quick "hack" I did six months ago, so I guess I should clean
it up a bit...

/Mike



Re: Question regarding your Plucker product

2001-11-26 Thread MJ Ray

> [...] Instead, there could be
> an additional program that the user runs manually which would read
> the Memo database that has been hotsynced to the PC, would pull
> out all the Memo records that were saved from the Plucker viewer,
> create a temporary HTML file from them. The new program would then
> call the parser to fetch all links in that HTML file.

I currently do something like this with the following command:

jpilot-dump -M | sed -e
'/^Plucker/,/^$/{;s/^Plucker.*$/<h4>&<\/h4>/;s/^.*:\/\/.*$/<a href="&">&<\/a>/;};/^[^<]/d' >home.html

(remove the line breaks), but it would be very good to have the Plucker
viewer write links to the memo database in a format closer to home.html,
instead of just the URLs; then I could just replace this with:

jpilot-dump -M | sed -e
'/^Plucker/,/^$/{;s/^Plucker.*$/<h4>&<\/h4>/;};/^[^<]/d' >home.html

and get more functionality to boot. I quite understand that providing the
full range of Plucker options on the Copy URL screen isn't viable. Is there
another way to read the Memo db?
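
One other way, sketched here as an untested assumption: pull the raw
database with pilot-xfer (e.g. "pilot-xfer -f MemoDB") and parse the
resulting MemoDB.pdb directly, using the published PDB file layout:

import struct

def read_memos(path):
    with open(path, "rb") as f:
        data = f.read()
    # 78-byte PDB header; the record count is its last 2 bytes.
    num_records = struct.unpack(">H", data[76:78])[0]
    # Each 8-byte record entry starts with a 4-byte data offset.
    offsets = [struct.unpack(">I", data[78 + 8*i : 82 + 8*i])[0]
               for i in range(num_records)]
    offsets.append(len(data))
    for start, end in zip(offsets, offsets[1:]):
        # Memo records are null-terminated text.
        yield data[start:end].split(b"\0")[0].decode("latin-1")

for memo in read_memos("MemoDB.pdb"):
    if memo.startswith("Plucker"):
        print(memo)

(Again: a sketch from the PDB spec, not something I have running.)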
-- 
MJR



Re: Question regarding your Plucker product

2001-11-26 Thread Chris Hawks

---Reply to mail from Alys about Question regarding your Plucker product

> How about this for what might be a simpler way of doing more or less
> what Bostjan requests:
> 
> Currently, when you click on a link that wasn't retrieved, you have
> the option to copy the URL to the Memo database, and you can if you
> wish add text after the URL. This feature could be left exactly as is;
> no changes to the viewer would be necessary. Instead, there could be
> an additional program that the user runs manually which would read
> the Memo database that has been hotsynced to the PC, would pull
> out all the Memo records that were saved from the Plucker viewer,
> create a temporary HTML file from them. The new program would then
> call the parser to fetch all links in that HTML file.

Mike wrote a program to do this. pluckerlinks???

---End reply

Christopher R. Hawks
HAWKSoft
-
"If you have trouble sounding condescending, find a Unix user to show you
how it's done.
-- Scott Adams


Re: Question regarding your Plucker product

2001-11-26 Thread Alys

How about this for what might be a simpler way of doing more or less
what Bostjan requests:

Currently, when you click on a link that wasn't retrieved, you have
the option to copy the URL to the Memo database, and you can if you
wish add text after the URL. This feature could be left exactly as is;
no changes to the viewer would be necessary. Instead, there could be
an additional program that the user runs manually which would read
the Memo database that has been hotsynced to the PC, would pull
out all the Memo records that were saved from the Plucker viewer,
create a temporary HTML file from them. The new program would then
call the parser to fetch all links in that HTML file.
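
As a rough sketch of what that additional program could look like (the
memo-dump format and all names here are assumptions, not an existing
tool):

import re

def memos_to_html(memo_dump):
    """memo_dump: memos separated by blank lines, e.g. jpilot-dump -M output."""
    links = []
    for memo in memo_dump.split("\n\n"):
        if not memo.startswith("Plucker"):
            continue  # keep only records saved from the Plucker viewer
        for line in memo.splitlines():
            m = re.match(r"(\w+://\S+)\s*(.*)", line)
            if m:  # a URL, optionally followed by user-added text
                url, text = m.group(1), m.group(2) or m.group(1)
                links.append('<a href="%s">%s</a>' % (url, text))
    return "<html><body>\n" + "<br>\n".join(links) + "\n</body></html>\n"

with open("plucker-links.html", "w") as f:
    f.write(memos_to_html(open("memos.txt").read()))
# Then run the usual parser with plucker-links.html as its home document.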

This might be the fastest way of getting the suggested functionality,
although it would require that the user runs the new program
themselves (or sets a cron job). However that could be marketed as a
Feature because it gives them full control over when the missing links
are retrieved. :)

To specify maxdepths, file names and other options, a section with a
certain name could be added to the usual Plucker config files by the
user. The new program would refer to that section when calling the
parser. Also (possibly as a future enhancement) the viewer could be
modified to add a little pull-down "maxdepth" selection list to the
"External Link" screen: before you click on the "Copy URL" button,
you select the maxdepth for that particular link from the list, and
it's saved in the Memo pad entry in some format that the new program
can read and understand.
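
For example, the new program might read a dedicated section from the
usual ini-style config file; the section and option names below are
invented for illustration:

# Sketch: read a hypothetical [plucked-links] section from the
# ini-style Plucker config file, e.g.
#
#   [plucked-links]
#   maxdepth = 2
#   db_name = PluckedLinks
import configparser, os

config = configparser.ConfigParser()
config.read(os.path.expanduser("~/.pluckerrc"))
maxdepth = config.getint("plucked-links", "maxdepth", fallback=2)
db_name = config.get("plucked-links", "db_name", fallback="PluckedLinks")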


I currently do something vaguely like this myself: I use pilot-link to
download the Memo database in the form of emails, manually view
the memo email folder in Mutt, filter for "Plucker URLs" items,
and send those filtered items into a very quick and dirty Perl
program which pulls out the URLs (and optional text that I might
have added) and creates an HTML file (or appends them to the file if
it already exists). A cron job then runs the normal Plucker parser
on that HTML file once a day, using the default maxdepth of 2.  It
works very well for me so I think that a program to automate the
pilot-link, Mutt and Perl steps would be useful for others.

Alys

--
Alice Harris
Internet Services / ESD Operations, CITEC
[EMAIL PROTECTED] [EMAIL PROTECTED]




On Mon, Nov 26, 2001 at 11:25:19AM -0800, David A. Desrosiers wrote:
> 
> > Example: I'm reading the downloaded news and I click on a link that was
> > not downloaded. I select this link (via checkbox) I name the link i.e.
> > "Link that was not downloaded the last time" and I select the depth of
> > gathering the information. When I come to another non-downloaded link, I
> > repeat the process.
> 
>   In order to do this, we need to actually store the string of
> characters which make up the "out of bounds" URLs which were not fetched.
> For a very large fetch, or a site which contained a lot of links, this could
> be a considerable size.
> 
>   Then there's the --no-urlinfo complex. If it's used, you lose the
> ability to retrieve those "out of bounds" urls.
> 
> > All that while I'm using my Palm. And when I HotSync the pda, the
> > Plucker downloads the newly made HTML page (i.e. "Extra pages that have
> > to be downloaded") and uses it in its next session. Would that be
> > possible? Is there any way you could implement this, while not needing
> > to rewrite the whole program all over again? :)
> 
>   And this brings up another issue, which is that we don't currently
> touch (update) the databases on the Palm with the parser on the desktop. In
> order to do this, we would now require that the Palm be in the cradle at
> *GATHER* time, or we'd have to cache off the Plucker databases on the
> desktop. Both are not ideal, and would require a lot more space on the Palm,
> assuming we use the Palm to add records.
> 
>   Alternately, we pull the database from the Palm, run a gather on the
> desktop, comparing against what is in the database we just pulled from the
> Palm, and then integrate those "missing" records. However, would you just
> want to append those records? Or remove the ones already read, and then
> replace them with the "out of bounds" records you checked?
> 
>   In order to do this, you need parser and viewer changes which
> slightly change the architecture a bit, by adding a 360-degree sync
> capability. The original Plucker implementation did this, with a local cache
> directory and then actually created the PDB on the Palm, using the desktop
> conduit, vs. the Python parser today which creates the PDB on the desktop,
> which you then sync to your Palm with your desktop tools.
> 
> 
> 
> /d
> 



Re: Question regarding your Plucker product

2001-11-26 Thread David A. Desrosiers


> Example: I'm reading the downloaded news and I click on a link that was
> not downloaded. I select this link (via checkbox) I name the link i.e.
> "Link that was not downloaded the last time" and I select the depth of
> gathering the information. When I come to another non-downloaded link, I
> repeat the process.

In order to do this, we need to actually store the string of
characters which make up the "out of bounds" URLs which were not fetched.
For a very large fetch, or a site which contained a lot of links, this could
be a considerable size.

Then there's the --no-urlinfo complex. If it's used, you lose the
ability to retrieve those "out of bounds" urls.

> All that while I'm using my Palm. And when I HotSync the pda, the
> Plucker downloads the newly made HTML page (i.e. "Extra pages that have
> to be downloaded") and uses it in its next session. Would that be
> possible? Is there any way you could implement this, while not needing
> to rewrite the whole program all over again? :)

And this brings up another issue, which is that we don't currently
touch (update) the databases on the Palm with the parser on the desktop. In
order to do this, we would now require that the Palm be in the cradle at
*GATHER* time, or we'd have to cache off the Plucker databases on the
desktop. Both are not ideal, and would require a lot more space on the Palm,
assuming we use the Palm to add records.

Alternately, we pull the database from the Palm, run a gather on the
desktop, comparing against what is in the database we just pulled from the
Palm, and then integrate those "missing" records. However, would you just
want to append those records? Or remove the ones already read, and then
replace them with the "out of bounds" records you checked?

In order to do this, you need parser and viewer changes which
slightly change the architecture a bit, by adding a 360-degree sync
capability. The original Plucker implementation did this, with a local cache
directory and then actually created the PDB on the Palm, using the desktop
conduit, vs. the Python parser today which creates the PDB on the desktop,
which you then sync to your Palm with your desktop tools.



/d