New Unofficial Jpluck build

2004-04-14 Thread Lambert, Mark
This one has added support for bookmark conversion from DOC files.
It also spits out the html it generates to \\palmdochandlertest.html if
you want to suggest  instead of  or something.

The jar to replace is jpluck2.jar
It is accessible now from xmission.com/~mlambert/jpluck2.jar

-- 
"In the digital world, we don't need back-ups, because a digital copy
never wears out."
  --Jack Valenti, head of the Motion Picture Association of America. 





E-Mail messages may contain viruses, worms, or other malicious code. By reading the 
message and opening any attachments, the recipient accepts full responsibility for 
taking protective action against such code. Sender is not liable for any loss or 
damage arising from this message.

The information in this e-mail is confidential and may be legally privileged. It is 
intended solely for the addressee(s). Access to this e-mail by anyone else is 
unauthorized.
___
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev


RE: JPluck and DOC files

2004-04-16 Thread Lambert, Mark
Download the jpluck2.jar from http://xmission.com/~mlambert/jpluck2.jar.
Rename the existing jpluck2.jar in Jpluck 2/build and put the new one in
its place.
Run it.
It also adds the ability to drag a file to the URL field when creating a
new site.
You should be able to pluck DOC files now.  If you can't, let me know.
One bit of debugging code is still in there: it writes
\\palmdochandlertest.html to the root directory of the drive that Jpluck
is installed on.  That is so that I (or you) can look at the raw HTML that
was used to create the Plucker document.

Mark

PS. I emailed Laurens about integrating it again yesterday.








RE: JPluck and DOC files

2004-04-16 Thread Lambert, Mark
If you could, run it like D:\Java\JDK1.4\bin\java.exe -jar "C:\Program
Files\JPluck 2\jpluckx.jar" so that it shows the console output.  That
will give me a better idea of what is happening and whether the file
actually is in the DOC format.  The other format I would really like to
support is Mobi, because I have a couple of books in that format but
haven't been able to find any converters.
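
One way to check whether a .pdb really is in the DOC format is to look at the type and creator fields in its header.  The following is a minimal sketch, not code from JPluck.  It assumes the usual PDB header layout (type at bytes 60-63, creator at bytes 64-67) and the commonly documented type/creator pairs for each format; the function name and the table are only illustrative.

```python
import struct

# Type/creator pairs as commonly documented for Palm ebook formats.
# Assumption: these come from general PDB format knowledge, not from JPluck.
KNOWN_FORMATS = {
    (b"TEXt", b"REAd"): "PalmDoc (DOC)",
    (b"BOOK", b"MOBI"): "Mobipocket",
    (b"PNRd", b"PPrs"): "Palm Reader (Peanut Press)",
}


def identify_pdb(header: bytes) -> str:
    """Return a format name given at least the first 78 bytes of a .pdb file."""
    if len(header) < 78:
        raise ValueError("a PDB header is at least 78 bytes long")
    # Bytes 60-63 hold the database type, bytes 64-67 the creator ID.
    db_type, creator = struct.unpack_from(">4s4s", header, 60)
    return KNOWN_FORMATS.get((db_type, creator),
                             "unknown (%r/%r)" % (db_type, creator))
```

To try it, read the first 78 bytes of the file in binary mode and pass them to identify_pdb.  A Palm Reader file would be reported as such rather than as PalmDoc, which would fit the "content not handled" warning in the log below.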

Mark

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Jeffrey A. Krzysztow
> Sent: Friday, April 16, 2004 11:58 AM
> To: [EMAIL PROTECTED]
> Subject: Re: JPluck and DOC files
> 
> Palm Reader can read DOC files, but also they have a 
> proprietary format that cannot be converted.
> 
> Jeffrey
> 
> Branko Strok said the following on 4/16/2004 12:46 PM:
> 
> >Mark and Jeff,
> >
> >It didn't work.
> >The pdb was meant for Palm Reader, and I assumed that it's in the DOC
> >format!?
> >Is that right, or did I miss something on that point?
> >
> >[12:44:18 PM] INFO: Starting conversion: EndofLife (C:\JPluck 
> >2\index.jxl)
> >
> >[12:44:18 PM] INFO: Retrieved: file:/C:/ebooks/ENDOFLIF.PDB
> >
> >[12:44:18 PM] WARNING: content/unknown content not handled.
> >
> >[12:44:18 PM] SEVERE: EndofLife: could not load starting URI 
> >file:/C:/ebooks/ENDOFLIF.PDB. Document cannot be generated.


RE: annotation branch

2004-04-20 Thread Lambert, Mark
I'll try it if you can send me a binary or tell me where to get one.
I'm not set up to compile it right now.

Mark 








RE: annotation branch binary

2004-04-21 Thread Lambert, Mark
Got it, but can't find how to do an annotation.
Any help?

Thanks,
Mark







Feature Request

2004-04-21 Thread Lambert, Mark
It would be nice to be able to set up separate icons for various books
so that clicking on them would take you directly to the book instead of
where you were last.

Mark



RE: Feature Request

2004-04-21 Thread Lambert, Mark
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> David A. Desrosiers
> Sent: Wednesday, April 21, 2004 11:44 AM
> To: Plucker Development List
> Subject: Re: Feature Request
> 
> 
> > It would be nice to be able to set up separate icons for 
> various books 
> > so that clicking on them would take you directly to the 
> book instead 
> > of where you were last.
> 
>   You can, just use the launchable flag, and change the 
> icon used for the launcher when creating the documents.
> 

Thanks - can you repeat that in smaller words and more steps?







RE: Feature Request

2004-04-21 Thread Lambert, Mark
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Lambert, Mark 
> Sent: Wednesday, April 21, 2004 11:52 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Feature Request
> 
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of David A. 
> > Desrosiers
> > Sent: Wednesday, April 21, 2004 11:44 AM
> > To: Plucker Development List
> > Subject: Re: Feature Request
> > 
> > 
> > > It would be nice to be able to set up separate icons for
> > various books
> > > so that clicking on them would take you directly to the
> > book instead
> > > of where you were last.
> > 
> > You can, just use the launchable flag, and change the 
> icon used for 
> > the launcher when creating the documents.
> > 
> 
> Thanks - can you repeat that in smaller words and more steps?
> 

Replying to myself.
Ah, I found the information on the launchable flag for the windows
conduit.  Now I just need to find out how to set it with Jpluck or on my
palm.
If I do this for documents on the VFS, will I need to move all of my
books somewhere specific?
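
For reference, the launchable flag is a single bit in the attribute field of the database header, so it can also be set on the desktop before the file is installed.  This is only a sketch under stated assumptions: the attribute field is taken to be the big-endian 16-bit value at offset 32, and the bit value 0x0200 is assumed to match dmHdrAttrLaunchableData in the Palm OS SDK.  It is not how JPluck or the Windows conduit does it, and it says nothing about whether launching works from VFS.

```python
import struct

ATTR_OFFSET = 32          # 16-bit attribute field follows the 32-byte name
ATTR_LAUNCHABLE = 0x0200  # assumed value of dmHdrAttrLaunchableData


def set_launchable(pdb: bytes) -> bytes:
    """Return a copy of a PDB image with the launchable-data bit set."""
    if len(pdb) < 78:
        raise ValueError("a PDB header is at least 78 bytes long")
    (attrs,) = struct.unpack_from(">H", pdb, ATTR_OFFSET)
    out = bytearray(pdb)
    # OR the bit in so that any attributes already present are preserved.
    struct.pack_into(">H", out, ATTR_OFFSET, attrs | ATTR_LAUNCHABLE)
    return bytes(out)
```

Work on a copy of the document; this sketch changes only the flag and does not touch the launcher icon.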







RE: Feature Request

2004-04-21 Thread Lambert, Mark
> Replying to myself.
> Ah, I found the information on the launchable flag for the 
> windows conduit.  Now I just need to find out how to set it 
> with Jpluck or on my palm.
> If I do this for documents on the VFS, will I need to move 
> all of my books somewhere specific?
> 
OK, using utilities like Filez/ZCat/Tcat/etc. I can get it working on my
Palm.
I can't seem to get it working on any documents on VFS.  Anyone done it
or know if it is possible?  I have 2 books that I use regularly that are
8M each.

Thanks,
Mark







RE: annotation branch binary

2004-04-21 Thread Lambert, Mark
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Alexander R. Pruss
> Sent: Wednesday, April 21, 2004 3:02 PM
> To: [EMAIL PROTECTED]
> Subject: RE: annotation branch binary
> 
> On Wed, 21 Apr 2004, Lambert, Mark  wrote:
> > Got it, but can't find how to do an annotation.
> > Any help?
> 
> Assign the annotation action (which may be something like 
> $$ACTION: ...) to something, e.g., to a key, press it and tap 
> on a word, or set the lookup pref to do annotations.
> 

Today must be my stupid day.  I have 4 $$SELECT:LOOKUP items.
They seem to work, but when I click on a highlighted annotation and
select Delete, sometimes they all go away.  In another case, I annotated
the first word and it didn't highlight; however, when I annotated it
again, the note was still there.









RE: Feature Request

2004-04-21 Thread Lambert, Mark
Sorry I am the whole list today.  If I set a Plucker document as
launchable in 1.7.1 (the annotation build, if it makes a difference), I
get a Fatal Alert when I select it (Form.c Line:5094, No form to return
to).

If anyone can reproduce it, I'll enter it into the bug database.
Mark







Progress note

2004-05-19 Thread Lambert, Mark
I haven't made a whole lot of headway on extended table support for
Jpluck, but I have made some.  In the process, I have decided I would
like to add my vote to somehow speed up table rendering (i.e. font
change or something).

On a slightly happier note, one of the reasons that I haven't made much
progress on the table support is that I am almost done with Mobi support
and have started on Peanut document support.  Currently the Mobi support
extracts the HTML and images and fixes up the image references in the
HTML to point to the correct image names.  The biggest difficulty so far
is fixing up the internal links in the documents, but that is almost
done.

One quick poll, if you will.  It may be possible for me to include
support for encrypted Peanut files.  The catch (both good and bad) is
that you will have to somehow supply the Registration name and CC info
to convert it.  Is this something that would be helpful, or something
that would be better left out?

Mark



RE: Progress note

2004-05-19 Thread Lambert, Mark
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Jeffrey A. Krzysztow
> Sent: Wednesday, May 19, 2004 4:04 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Progress note
> 
> 
> Lambert, Mark said the following on 5/19/2004 9:45 AM:
> 
> Great! They have a large library of docs. Will it decrypt?
> 

Most things (that I have tried) it will decrypt fine.

> >One quick poll if you will.  It may be possible for me to include 
> >support for encrypted Peanut files.  The catch (both good 
> and bad) is 
> >you will have to somehow supply the Registration name and CC info to 
> >convert it.  Is this something that would be helpful or 
> something that 
> >I would be better left out?
> >
> >
> >  
> >
> Would be cool to be able to generate Peanut Press Content 
> without their programs and prices.
> 
> Will this get committed to the main JPluck? Did the PalmDoc 
> converter ever make it?

Probably not.  Laurens wants to keep JPluck as a web content converter
as opposed to a general purpose Plucker document creator.

Mark







RE: Progress note

2004-05-20 Thread Lambert, Mark
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Alexander R. Pruss
> Sent: Wednesday, May 19, 2004 5:44 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Progress note
> 
> Wouldn't allowing decryption of Peanut Press stuff be a DMCA 
> liability?  
> I guess it may be a question of how the texts are 
> licensed--does the text license only permit decryption on 
> their reader, or does it permit decryption on any compatible reader?
> 
> Alex
> 

I don't know and that is why I asked.  I have never used any of their
stuff.  One thing in our favor is that the license information (name and
CC#) is required to decrypt it.  In other words we would not be
'circumventing' a protection device, but using it with the license
information.

Mark








RE: Progress note (other formats)

2004-05-20 Thread Lambert, Mark

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Jewett, Jim J
> Sent: Thursday, May 20, 2004 8:26 AM
> To: '[EMAIL PROTECTED]'
> Subject: Re: Progress note (other formats)
> 
> Lambert, Mark:
> 
> > I would like to add my vote to somehow speed up table 
> rendering (i.e. 
> > font change or something).
> 
> Are all tables slow, or nested tables, or tables with a font 
> different from the default, or ..?
> 
> (I don't (currently) often read channels with tables, and 
> never markup tables, so I don't see the problem myself ... 
> which is why I'm asking for clarification.)
> 

The site that made tables painful for me was the Wheel of Time FAQ
(http://www.steelypips.org/wotfaq/); try section 2.7.5.  I am using the
NimbusSans15 font.

> > I am almost done with Mobi support and have started Peanut document 
> > support.
> 
> Am I correct that this is a server- (PC-) side transformation, 
> rather than letting the viewer read the format natively?
> 

Yes

> > It may be possible for me to include support for encrypted Peanut 
> > files.  The catch (both good and bad) is you will have to somehow 
> > supply the Registration name and CC info to convert it.  Is this 
> > something that would be helpful or something that I would be better 
> > left out?
> 
> Supply it to what?  To a Peanut webserver that your code 
> verifies against, or supply it to a decryption routine as 
> part of the key?
> 

Decryption routine.  It is actually available in the public domain at
http://membres.lycos.fr/pc1/

> In general, I would say yes, but I can understand reluctance 
> to release code that lets people read (and copy) documents 
> they haven't paid for.  If the CC is part of the decryption 
> key, that shouldn't be a problem; the number stays on their 
> own machine, and you don't ask for anything more than the 
> standard reader does.  
> 
> Also, IIRC, JPluck has a BSD license, so you *could* keep 
> part of the source hidden, or even use a plugin, if that 
> would satisfy Peanut Press.
> 
> But once the document is decrypted, will you need to 
> reencrypt the plucker format version?  Would you need to 
> include the CC information in that too somehow, to discourage 
> beaming?  (As part of the owner string?  Just as text?)
> 

This was mentioned before and it might suffice.  I would hesitate to
include the CC information though.  It would deter copying, but it would
also get left around because of backups, etc. and I am very wary of
identity and CC theft. 

Mark







DOC and Mobi conversion

2004-06-14 Thread Lambert, Mark
(Resend because of the flood)

I have a test version of the DOC and Mobi converter.  For now it is a
simple standalone jar that takes the file name on the command line.
Laurens doesn't want to integrate it with JPluck, so this may be a
better solution in the long run.  It extracts the text as HTML, along
with any images and bookmarks, and saves them all in the same directory
as the original file.  It also tries to fix up links in the document,
but it does not yet correctly fix links that point to separate original
documents.  I also do not have the Peanut stuff working yet, in part
because I have no Peanut files to test with.  Maybe after I finish the
current stuff I will work on adding unplucking.
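
For anyone curious what extracting the text of a DOC file involves, the core step is undoing the PalmDoc compression of each text record.  The sketch below is not taken from the converter; it follows the scheme as it is commonly documented, and it assumes the record is known to be compressed (header record reports compression type 2).  The function name is only illustrative.

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Decompress one PalmDoc text record (sketch of the usual LZ77 variant)."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        c = data[i]
        i += 1
        if 1 <= c <= 8:
            # 0x01-0x08: copy the next c bytes unchanged.
            out += data[i:i + c]
            i += c
        elif c < 0x80:
            # 0x00 and 0x09-0x7F stand for themselves.
            out.append(c)
        elif c >= 0xC0:
            # 0xC0-0xFF: a space followed by the character (c XOR 0x80).
            out.append(0x20)
            out.append(c ^ 0x80)
        else:
            # 0x80-0xBF: two-byte back-reference, 11-bit distance, 3-bit length.
            if i >= n:
                raise ValueError("truncated back-reference")
            pair = ((c << 8) | data[i]) & 0x3FFF
            i += 1
            distance = pair >> 3
            length = (pair & 0x07) + 3
            if distance == 0 or distance > len(out):
                raise ValueError("back-reference points outside the output")
            # Copy byte by byte, since the source may overlap the new output.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)
```

The result is raw text in the document's own encoding; turning it into HTML, and handling bookmarks and images, are separate steps that this sketch does not cover.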

By the way, because of some ImageIO bugs in 1.4 and better BMP support
in 1.5, documents with images come out much better if you run it with
the 1.5 betas.  I have tried betas 1 and 2, but not the beta3 snapshots.

It is at http://www.xmission.com/~mlambert/UnDocker.jar (note case)

Let me know how it works for you or if you would like a copy of the
source.

Mark



Desktop vs JPluck

2004-06-14 Thread Lambert, Mark
I have been comparing the Desktop and Jpluck a little recently, and I
have found one thing that puzzles me.  If I pluck a large document with
Jpluck, the resulting file is 9,561,208 bytes, but if I pluck it in the
Desktop (with matching settings, as far as I can tell) it is 12,348,756
bytes and takes over 4 times as long.  Are others seeing similar results
with large documents with images?  I realize that Jpluck is missing a
number of features that the Desktop has (enhanced tables, executable
flag, some of the image and output options), but if this holds true it
might be something that could be improved in the Desktop.
Unfortunately, I know zero Python, so I can't help much there.

Mark




RE: Desktop vs JPluck

2004-06-15 Thread Lambert, Mark
> [mailto:[EMAIL PROTECTED] On Behalf Of Alan Hoyle
> Sent: Monday, June 14, 2004 5:24 PM
> 
> On Mon, 14 Jun 2004 at 16:38, Lambert, Mark  wrote:
> 
> > If I pluck a large document with Jpluck the resulting file is 
> > 9,561,208 bytes, but if I pluck it in the Desktop (with matching 
> > settings as far as I can tell) it is 12,348,756 bytes in 
> addition to 
> > taking over 4 times as long.
> 
> I don't know about size, but JPluck can do multiple 
> simultaneous http connections which can greatly accelerate the 
> downloading part.
> 

These are 751 files that are on my local disk.  Jpluck takes 39 seconds
and Desktop takes 5:40 to complete.  Two other interesting notes: Jpluck
lists 1182 files and Desktop lists 1313 files, and Desktop also takes
over 20 seconds just to write out the file.
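
Putting the numbers from this thread side by side (the variable names are only for illustration, and 5:40 is read as 5 minutes 40 seconds):

```python
# Figures quoted in this thread for the same 751 local input files.
jpluck_bytes, desktop_bytes = 9_561_208, 12_348_756
jpluck_secs, desktop_secs = 39, 5 * 60 + 40
jpluck_files, desktop_files = 1182, 1313

size_increase = desktop_bytes / jpluck_bytes - 1  # relative to the Jpluck output
time_ratio = desktop_secs / jpluck_secs
extra_files = desktop_files - jpluck_files

print(f"Desktop output is {size_increase:.1%} larger")    # 29.2% larger
print(f"Desktop takes {time_ratio:.1f}x as long")         # 8.7x as long
print(f"Desktop lists {extra_files} more files")          # 131 more files
```

The 131 extra files listed by Desktop may account for part of the size difference, but the thread does not establish that.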

Mark







RE: Desktop vs JPluck

2004-06-15 Thread Lambert, Mark
Understand that I am not trying to start a good versus bad or Python
versus Java discussion with this topic.  I am, however, trying to point
out the various weaknesses in the tools.  In my mind the best scenario
would be for both tools to produce very similar output in about the
same amount of time.  What I don't like seeing is a 20% or more
difference in output size when the input is fairly simple HTML with
images.  If someone has a working unplucker build for Windows, I can
extract both versions and see what the major differences are.  If not,
I'll try to get the code from CVS and build one this week.

Trying to improve the product,

Mark








New version of UnDOCer

2004-07-30 Thread Lambert, Mark
This one only runs on Java 1.5 for now because of some severe image
processing bugs in 1.4.
I have included with the JAR all of the source for anyone who wants to
look at it (and fix bugs).
It still has the bug where it will lock up on certain files while trying
to clean up the REFs.  To get around this it outputs _RAW.html
before it begins this step.

The (case-sensitive) URL is http://xmission.com/~mlambert/UnDOCer.jar

Any feedback is appreciated.

Mark

PS. To run it type either UnDOCer.jar  or java -jar
UnDOCer.jar 



RE: David and group has taken over JPluck, so to speak...

2004-09-14 Thread Lambert, Mark
Jeffrey,
Mail to you keeps bouncing so...

I knew, and hope to help when I have more time. (I am currently involved
in starting up a Charter School and the charter is due to the Utah State
Legislature next week).  I would be happy to have the utility included
somehow (The core is your code anyway).  I'll see if I can wrap up some
of the stuff I was working on over the next week so that it is fit for
public consumption.

Mark
-- 
"Homemaker is the ultimate career. All other careers exist for one
purpose only -- to support the ultimate career!" C.S. Lewis

> -Original Message-
> From: Jeffrey A. Krzysztow [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, September 09, 2004 7:30 PM
> To: Lambert, Mark 
> Subject: David and group has taken over JPluck, so to speak...
> 
> Mark,
> 
> David and group has taken over JPluck, so to speak and I do 
> not know if you were invited or not, but if you are 
> interested, I would ask David.
> 
> But, the reason I was contacting you, is if the group does 
> not mind, I'd like to see if the utility you wrote (UnDOCer) 
> could be built into JPluck directly or at least included with JPluck.
> 
> Jeffrey
> 
> 







RE: Plucker server on Project Gutenberg

2005-11-02 Thread Lambert, Mark
I don't know if this would help or not, but I always go off the HTML
version and break on any H1 or H2.  That isn't perfect either, but is
easier to do. 
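
A minimal sketch of the "break on any H1 or H2" idea, assuming the book is available as a single HTML string.  The function name is hypothetical and this is not PyPlucker or JPluck code.  It uses a regular expression rather than a real HTML parser, so it would also split on heading tags that appear inside comments or scripts, and it relies on re.split accepting empty matches (Python 3.7 or later).

```python
import re

# Zero-width match just before every opening <h1 ...> or <h2 ...> tag.
HEADING_START = re.compile(r"(?=<h[12][\s>])", re.IGNORECASE)


def split_on_headings(html: str) -> list:
    """Split an HTML string into chunks that each start at an H1 or H2."""
    # Anything before the first heading becomes its own leading chunk;
    # empty or whitespace-only chunks are dropped.
    return [part for part in HEADING_START.split(html) if part.strip()]
```

Each chunk keeps its markup, so it could be written out as a separate page and linked from a generated table of contents.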

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Marcello
Perathoner
Sent: Wednesday, November 02, 2005 10:10 AM
To: plucker-dev@rubberchicken.org
Subject: Re: Plucker server on Project Gutenberg

David A. Desrosiers wrote:

>> I'm going to replace the text/plain parser with a custom one that 
>> will (try to) parse chapter heads, italics etc. out of the plain text.
> 
> I'd be interested to see how you solve the context issue that has 
> been brought up on the pg lists over the last year or so. It's a very 
> complicated issue, and to date, nobody has solved it without trying to
> reinvent the base PG text format into something different.

I have the option of either running a filter in front of PyPlucker:

   pgtext > filter | PyPlucker > pdb

or writing a custom parser for PyPlucker.


The PG format has changed a lot over 30+ years. None of the 3rd-party
tools I know is able to correctly parse all PG texts.

The custom text/plain parser I'm writing will plug into PyPlucker and do
a very simple analysis of the text. I'm not aiming at a 100% or even 99%
solution. I'm just trying to make the average PG text look good enough
for distribution.
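
To give an idea of what "a very simple analysis" could look like, here
is a minimal Python sketch. It is only an illustration, not the parser
itself and not PyPlucker's API: the function name plain_to_html is
hypothetical, and it assumes that paragraphs are separated by blank
lines and that italics are marked with _underscores_, which is common
in PG texts but by no means universal.

```python
import html
import re

# Assumption: italics are marked as _like this_ in the plain text.
ITALIC = re.compile(r"_([^_\n]+)_")


def plain_to_html(text):
    """Turn blank-line separated plain text into simple HTML paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    out = []
    for para in paragraphs:
        # Re-join hard-wrapped lines, then escape &, < and > for HTML.
        body = html.escape(" ".join(para.split()))
        # Convert the assumed _italic_ markup to <i> tags.
        body = ITALIC.sub(r"<i>\1</i>", body)
        out.append("<p>" + body + "</p>")
    return "\n".join(out)


if __name__ == "__main__":
    print(plain_to_html("It was _very_\ndark.\n\nTom & Jerry"))
```

The resulting HTML could then go through the normal HTML path. Texts
that use underscores for something else would need extra checks.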




--
Marcello Perathoner
[EMAIL PROTECTED]








RE: Plucker server on Project Gutenberg

2005-11-02 Thread Lambert, Mark
>On Behalf Of Marcello Perathoner
>Sent: Wednesday, November 02, 2005 1:37 PM
>To: plucker-dev@rubberchicken.org
>Subject: Re: Plucker server on Project Gutenberg
>
>Lambert, Mark wrote:
>
>> I don't know if this would help or not, but I always go off the HTML
>> version and break on any H1 or H2.  That isn't perfect either, but is
>> easier to do.
>
>Not all PG ebooks have an HTML version.

True, and then I have to use regex to break things up and each book is
different... 
But it is low-hanging fruit that would make it simpler for those that
have HTML.

<h1>(.*)</h1>
is much easier than
^(CHAPTER .*|BOOK .*|PART .*|PROLOGUE|EPILOGUE|ABOUT THE AUTHOR|GLOSSARY|DRAMATIS PERSONA|CHARACTERS)$
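
For the plain-text case, a minimal Python sketch of how that second
pattern might be applied follows. The pattern is the one above; the
function name split_chapters is made up for the example, and it assumes
each heading sits alone on its own line in upper case. Every book needs
its own tweaks, so treat this as a starting point only.

```python
import re

# The heading pattern from above; MULTILINE makes ^ and $ match per line.
CHAPTER_HEAD = re.compile(
    r"^(CHAPTER .*|BOOK .*|PART .*|PROLOGUE|EPILOGUE|ABOUT THE AUTHOR"
    r"|GLOSSARY|DRAMATIS PERSONA|CHARACTERS)$",
    re.MULTILINE,
)


def split_chapters(text):
    """Return a list of (heading, body) pairs.

    Text before the first heading is returned with heading None.
    """
    chunks = []
    last_head, last_end = None, 0
    for m in CHAPTER_HEAD.finditer(text):
        body = text[last_end:m.start()].strip()
        # Skip an empty front-matter chunk, keep everything else.
        if last_head is not None or body:
            chunks.append((last_head, body))
        last_head, last_end = m.group(1), m.end()
    chunks.append((last_head, text[last_end:].strip()))
    return chunks


if __name__ == "__main__":
    book = "Title page\n\nCHAPTER I\n\nIt began.\n\nEPILOGUE\n\nDone.\n"
    for heading, body in split_chapters(book):
        print(heading, "->", body)
```

Files with CRLF line endings would need the \r stripped first,
otherwise the fixed headings such as PROLOGUE will not match.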

Mark







RE: Plucker server on Project Gutenberg

2005-11-03 Thread Lambert, Mark
 
> From: Marcello Perathoner
> Lambert, Mark wrote:
> 
> > But it is low-hanging fruit that would make it simpler for those that
> > have HTML.
> 
> If they have HTML, of course I use HTML. But more than half 
> of them don't.
> 

No worries.  I wasn't sure whether you were or not, so I thought I'd
mention it.  I read 2+ books a week on my Palm and always have my eye
out for ways to make it easier.  For example, I never work off TXT
directly; I always run txt2html on it first.  I always replace ellipses
with … (sometimes thousands in a book), which may make the file a
little smaller and which I think looks better.  I also break books up by
chapter (this significantly speeds up display on my Clie), usually with
htmlsplitter (rekenwonder.com).
This week I have read Knife of Dreams (library), The Penultimate
Peril (own), Glory Road (library), Plague Ship (Gutenberg), and Old
Nathan (Baen Free Library) on my Clie.
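
The ellipsis cleanup mentioned above could be sketched in Python like
this. It is only an illustration, not the exact steps I run: the
function name replace_ellipses is made up, and it assumes ellipses are
typed as three periods, either run together or separated by single
spaces.

```python
import re

# Three periods, either "..." or ". . ." (assumed spellings).
ELLIPSIS = re.compile(r"\.\.\.|\. \. \.")


def replace_ellipses(text, replacement="\u2026"):
    """Replace typed-out ellipses and return (new_text, number_replaced).

    The default replacement is the single ellipsis character; pass
    "&hellip;" instead to emit the HTML entity.
    """
    return ELLIPSIS.subn(replacement, text)


if __name__ == "__main__":
    fixed, count = replace_ellipses("Wait... what. . . no....")
    print(count, fixed)
```

A fourth period after an ellipsis is left alone, so a sentence that
ends in "...." keeps its full stop.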

Mark




