Re: Question regarding your Plucker product
> It was a quick "hack" I did six months ago, so I guess I should clean it
> up a bit...

It'll need a quick modification when I release the new pilot-link. The whole libpisock backend is being rewritten substantially to implement protocol "recipes", so we can have per-device protocol separation. More on that later.

/d
RE: Question regarding your Plucker product
Hi,

I'll just start by boosting your egos and saying that Plucker is great, and it's wonderful that you have put so much time into making it what it is today.

> Currently, when you click on a link that wasn't retrieved, you have
> the option to copy the URL to the Memo database, and you can if you
> wish add text after the URL. This feature could be left exactly as is;
> no changes to the viewer would be necessary. Instead, there could be
> an additional program that the user runs manually which would read
> the Memo database that has been hotsynced to the PC, would pull
> out all the Memo records that were saved from the Plucker viewer,
> create a temporary HTML file from them. The new program would then
> call the parser to fetch all links in that HTML file.

How about extending the parser to understand the concept of "add missing links to document"? The missing links would then not become a new document, but would instead appear in the original doc. This would need some kind of functionality in the parser which (the next time the document is fetched) would fetch a link if it exists in a "missing links" array, regardless of maxdepth, stayonhost and other constraints. Links which cannot be mapped to a source document could be fetched into a "Missing links" document. The original source document would not have to be known; we just fetch the "missing link" if we stumble upon it in a page we fetch.

Problems, as I see them, are at least:

- How do we know if the missing link has been fetched? We might be doing several parser runs. Maybe some way of telling the parser "fetch all missing links not yet fetched" into a separate document?
- Right now we don't fetch at sync time (and probably don't want to). So the round trip would be sync - fetch - sync, if I understand it correctly?
- Somebody might always/sometimes want the missing links in a new document.
- Problems when the source document changes?
- This is just an idea, not a complete solution.
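The "fetch it if it is in the missing-links array, regardless of the usual constraints" idea above could be sketched roughly as follows. This is a hypothetical illustration, not the real Plucker parser API: the names `should_fetch`, `missing_links`, `max_depth` and `stay_on_host` are all invented for the example.

```python
# Hypothetical sketch: a fetch decision that honors a user-flagged
# "missing links" set even when maxdepth/stayonhost would forbid it.
from urllib.parse import urlparse

def should_fetch(url, depth, start_host, missing_links,
                 max_depth=2, stay_on_host=True):
    """Decide whether the gatherer should retrieve `url`."""
    if url in missing_links:
        # A link the user flagged on the handheld: fetch it even though it
        # is out of bounds, and mark it handled so that repeated parser
        # runs do not fetch it again.
        missing_links.discard(url)
        return True
    if depth > max_depth:
        return False
    if stay_on_host and urlparse(url).netloc != start_host:
        return False
    return True

missing = {"http://example.org/far/away.html"}
# Out of bounds by both depth and host, but flagged as missing:
print(should_fetch("http://example.org/far/away.html", 5, "example.com", missing))  # True
# A second run no longer fetches it:
print(should_fetch("http://example.org/far/away.html", 5, "example.com", missing))  # False
```

Removing the link from the set as soon as it is fetched is one possible answer to the first problem above (knowing whether a missing link was already fetched across several parser runs), though the set would have to be persisted between runs.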
> To specify maxdepths, file names and other options, a section with a
> certain name could be added to the usual Plucker config files by the
> user. The new program would refer to that section when calling the
> parser. Also (possibly as a future enhancement) the viewer could be
> modified to add a little pull-down "maxdepth" selection list to the
> "External Link" screen: before you click on the "Copy URL" button,
> you select the maxdepth for that particular link from the list, and
> it's saved in the Memo pad entry in some format that the new program
> can read and understand.

Sounds reasonable. A field for naming the link would also be great; it could default to the body of the tag. And maybe a checkbox for selecting whether we want to copy the link as HTML (an anchor tag) or as plain text? Right now the missing link/external link form quite verbosely informs the user about why the document is missing, so there would be room for new fields. Then again, if all the bells and whistles are added, new users might be confused (there probably is a reason for the long text in the form?).

> > Then there's the --no-urlinfo complex. If it's used, you lose the
> > ability to retrieve those "out of bounds" urls.

But that is a deliberate(?) choice made by the user who created the document?

> > Alternately, we pull the database from the Palm, run a gather on the
> > desktop, comparing against what is in the database we just pulled from the
> > Palm, and then integrate those "missing" records. However, would you just
> > want to append those records? Or remove the ones already read, and then
> > replace them with the "out of bounds" records you checked?

If we merge the missing links into the original source document, then we keep the missing link as long as we have the document. This again conflicts with fetching the unseen links into a "missing links" document. I would like a nice GUI for managing the links (on the handheld and on the desktop) ;-).
Managing the memos which Plucker currently creates is not so easy, but it is far better than nothing. In my structured world the missing links should go into a separate database (I believe this was debated a while back on this list?).

What is the concept on non-Windows platforms with backup databases? On Windows the HotSync manager makes a copy of the database on the handheld. This copy could, under some conditions, be checked for what pages are included in a document. Doesn't the Plucker parser also keep a copy of the databases it creates? That database could be checked before it is replaced by a new one.

Yes, I know Plucker is Open Source, that nobody gets paid for implementing new features (or maintaining it), that the source is in CVS, and that if I want something I can (or might be able to) implement it myself.

-Jonte
Re: Question regarding your Plucker product
On Mon, Nov 26, 2001, Chris Hawks wrote:

> Mike wrote a program to do this. pluckerlinks???

I had completely forgotten about that program ;-) It was a quick "hack" I did six months ago, so I guess I should clean it up a bit...

/Mike
Re: Question regarding your Plucker product
> [...] Instead, there could be
> an additional program that the user runs manually which would read
> the Memo database that has been hotsynced to the PC, would pull
> out all the Memo records that were saved from the Plucker viewer,
> create a temporary HTML file from them. The new program would then
> call the parser to fetch all links in that HTML file.

I currently do something like this with the following command (all on one line):

jpilot-dump -M | sed -e '/^Plucker/,/^$/{;s/^Plucker.*$/&<\/h4>/;s/^.*:\/\/.*$/&<\/a>/;};/^[^<]/d' >home.html

but it would be very good to have the Plucker viewer write links to the memo database in something closer to home.html format, instead of just the URLs. Then I could replace this with:

jpilot-dump -M | sed -e '/^Plucker/,/^$/{;s/^Plucker.*$/&<\/h4>/;};/^[^<]/d' >home.html

and get more functionality to boot. I quite understand that providing the full range of Plucker options on the copy-url screen isn't viable. Is there another way to read the memo db?

-- MJR
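For anyone who finds the sed one-liner above hard to maintain, the same transformation could be sketched in Python. The memo layout assumed here (a title line starting with "Plucker", then URL lines, terminated by a blank line) mirrors what the sed script matches; the script name and exact memo format are assumptions, so adjust to your actual `jpilot-dump -M` output.

```python
# Rough stand-in for the sed pipeline: turn jpilot-dump -M output into
# an HTML page of links from "Plucker..." memos.
import re
import sys

def memos_to_html(dump_text):
    out = []
    in_plucker_memo = False
    for line in dump_text.splitlines():
        if line.startswith("Plucker"):
            out.append("<h4>%s</h4>" % line)       # memo title becomes a heading
            in_plucker_memo = True
        elif in_plucker_memo and re.search(r"://", line):
            url = line.strip()
            out.append('<a href="%s">%s</a>' % (url, url))
        elif not line.strip():
            in_plucker_memo = False               # blank line ends the memo
    return "\n".join(out)

if __name__ == "__main__":
    sys.stdout.write(memos_to_html(sys.stdin.read()))
```

Used the same way as the sed version, e.g. `jpilot-dump -M | python memos_to_html.py > home.html`, and easier to extend later with the "optional text after the URL" feature discussed above.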
Re: Question regarding your Plucker product
---Reply to mail from Alys about Question regarding your Plucker product

> How about this for what might be a simpler way of doing more or less
> what Bostjan requests:
>
> Currently, when you click on a link that wasn't retrieved, you have
> the option to copy the URL to the Memo database, and you can if you
> wish add text after the URL. This feature could be left exactly as is;
> no changes to the viewer would be necessary. Instead, there could be
> an additional program that the user runs manually which would read
> the Memo database that has been hotsynced to the PC, would pull
> out all the Memo records that were saved from the Plucker viewer,
> create a temporary HTML file from them. The new program would then
> call the parser to fetch all links in that HTML file.

Mike wrote a program to do this. pluckerlinks???

---End reply

Christopher R. Hawks
HAWKSoft
"If you have trouble sounding condescending, find a Unix user to show you how it's done." -- Scott Adams
Re: Question regarding your Plucker product
How about this for what might be a simpler way of doing more or less what Bostjan requests:

Currently, when you click on a link that wasn't retrieved, you have the option to copy the URL to the Memo database, and you can if you wish add text after the URL. This feature could be left exactly as is; no changes to the viewer would be necessary. Instead, there could be an additional program that the user runs manually which would read the Memo database that has been hotsynced to the PC, would pull out all the Memo records that were saved from the Plucker viewer, and create a temporary HTML file from them. The new program would then call the parser to fetch all links in that HTML file.

This might be the fastest way of getting the suggested functionality, although it would require that the user run the new program themselves (or set up a cron job). However, that could be marketed as a Feature, because it gives them full control over when the missing links are retrieved. :)

To specify maxdepths, file names and other options, a section with a certain name could be added to the usual Plucker config files by the user. The new program would refer to that section when calling the parser. Also (possibly as a future enhancement) the viewer could be modified to add a little pull-down "maxdepth" selection list to the "External Link" screen: before you click on the "Copy URL" button, you select the maxdepth for that particular link from the list, and it's saved in the Memo pad entry in some format that the new program can read and understand.

I currently do something vaguely like this myself: I use pilot-link to download the Memo database in the form of emails, manually view the memo email folder in Mutt, filter for "Plucker URLs" items, and send those filtered items into a very quick and dirty Perl program which pulls out the URLs (and optional text that I might have added) and creates an HTML file (or appends them to the file if it already exists).
A cron job then runs the normal Plucker parser on that HTML file once a day, using the default maxdepth of 2. It works very well for me, so I think that a program to automate the pilot-link, Mutt and Perl steps would be useful for others.

Alys
--
Alice Harris
Internet Services / ESD Operations, CITEC
[EMAIL PROTECTED]
[EMAIL PROTECTED]

On Mon, Nov 26, 2001 at 11:25:19AM -0800, David A. Desrosiers wrote:

> > Example: I'm reading the downloaded news and I click on a link that was
> > not downloaded. I select this link (via checkbox), I name the link, i.e.
> > "Link that was not downloaded the last time", and I select the depth of
> > gathering the information. When I come to another non-downloaded link, I
> > repeat the process.
>
> In order to do this, we need to actually store the string of
> characters which make up the "out of bounds" URLs which were not fetched.
> For a very large fetch, or a site which contained a lot of links, this could
> be a considerable size.
>
> Then there's the --no-urlinfo complex. If it's used, you lose the
> ability to retrieve those "out of bounds" urls.
>
> > All that while I'm using my Palm. And when I HotSync the pda, the
> > Plucker downloads the newly made HTML page (i.e. "Extra pages that have
> > to be downloaded") and uses it in its next session. Would that be
> > possible? Is there any way you could implement this, while not needing
> > to rewrite the whole program all over again? :)
>
> And this brings up another issue, which is that we don't currently
> touch (update) the databases on the Palm with the parser on the desktop. In
> order to do this, we would now require that the Palm be in the cradle at
> *GATHER* time, or we'd have to cache off the Plucker databases on the
> desktop. Both are not ideal, and would require a lot more space on the Palm,
> assuming we use the Palm to add records.
>
> Alternately, we pull the database from the Palm, run a gather on the
> desktop, comparing against what is in the database we just pulled from the
> Palm, and then integrate those "missing" records. However, would you just
> want to append those records? Or remove the ones already read, and then
> replace them with the "out of bounds" records you checked?
>
> In order to do this, you need parser and viewer changes which
> slightly change the architecture a bit, by adding a 360-degree sync
> capability. The original Plucker implementation did this, with a local cache
> directory and then actually created the PDB on the Palm, using the desktop
> conduit, vs. the Python parser today which creates the PDB on the desktop,
> which you then sync to your Palm with your desktop tools.
>
> /d
Re: Question regarding your Plucker product
> Example: I'm reading the downloaded news and I click on a link that was
> not downloaded. I select this link (via checkbox), I name the link, i.e.
> "Link that was not downloaded the last time", and I select the depth of
> gathering the information. When I come to another non-downloaded link, I
> repeat the process.

In order to do this, we need to actually store the string of characters which make up the "out of bounds" URLs which were not fetched. For a very large fetch, or a site which contained a lot of links, this could be a considerable size.

Then there's the --no-urlinfo complex. If it's used, you lose the ability to retrieve those "out of bounds" URLs.

> All that while I'm using my Palm. And when I HotSync the pda, the
> Plucker downloads the newly made HTML page (i.e. "Extra pages that have
> to be downloaded") and uses it in its next session. Would that be
> possible? Is there any way you could implement this, while not needing
> to rewrite the whole program all over again? :)

And this brings up another issue, which is that we don't currently touch (update) the databases on the Palm with the parser on the desktop. In order to do this, we would now require that the Palm be in the cradle at *GATHER* time, or we'd have to cache off the Plucker databases on the desktop. Both are not ideal, and would require a lot more space on the Palm, assuming we use the Palm to add records.

Alternately, we pull the database from the Palm, run a gather on the desktop, comparing against what is in the database we just pulled from the Palm, and then integrate those "missing" records. However, would you just want to append those records? Or remove the ones already read, and then replace them with the "out of bounds" records you checked?

In order to do this, you need parser and viewer changes which slightly change the architecture a bit, by adding a 360-degree sync capability.
The original Plucker implementation did this, with a local cache directory, and actually created the PDB on the Palm using the desktop conduit, vs. the Python parser today, which creates the PDB on the desktop, which you then sync to your Palm with your desktop tools.

/d
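The append-vs-replace question above (just add the new "out of bounds" records, or first drop the ones already read?) could be sketched as a toy merge over record maps. Everything here is invented for illustration: the record shape, the `read` flag, and the policy names are assumptions, not Plucker's actual database format.

```python
# Toy sketch of the two merge policies: append the newly gathered
# out-of-bounds records, or first discard records the user already read.
def merge(palm_records, gathered, policy="append"):
    """palm_records/gathered: dicts mapping record id -> {'read': bool, ...}."""
    merged = dict(palm_records)
    if policy == "replace_read":
        # Make room by discarding records the user has already read.
        merged = {rid: rec for rid, rec in merged.items() if not rec.get("read")}
    merged.update(gathered)  # integrate the out-of-bounds records the user checked
    return merged

palm = {1: {"read": True}, 2: {"read": False}}
new = {3: {"read": False}}
print(sorted(merge(palm, new).keys()))                         # [1, 2, 3]
print(sorted(merge(palm, new, policy="replace_read").keys()))  # [2, 3]
```

Either way, the comparison has to happen against the database actually pulled from the Palm, which is exactly the "360-degree sync" capability the message above says today's desktop-only Python parser lacks.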