style

2001-10-17 Thread Bill Nalen/Towers Perrin


The style tag doesn't seem to be handled in my copy of Plucker (version 1.1,
I think).  I think it needs start_style and end_style functions in
TextParser.py that behave like the ones for head (i.e. turn visibility off
on start and back on at end).  I checked out what I believe is the latest
CVS version and didn't see it in there either.  I may have missed something,
though.
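
A minimal sketch of that idea, written against the standard library's html.parser rather than Plucker's TextParser.py, whose handler API is not shown in this thread. The names VisibleTextParser, hidden_depth and visible_text are hypothetical; the only point is that style is treated like head, so visibility goes off at the start tag and comes back on at the end tag.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text, skipping anything inside <head> or <style>."""

    HIDDEN_TAGS = ("head", "style")

    def __init__(self):
        super().__init__()
        self.hidden_depth = 0   # greater than 0 while inside a hidden element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN_TAGS:
            self.hidden_depth += 1      # turn visibility off

    def handle_endtag(self, tag):
        if tag in self.HIDDEN_TAGS and self.hidden_depth > 0:
            self.hidden_depth -= 1      # turn visibility back on

    def handle_data(self, data):
        if self.hidden_depth == 0:
            self.chunks.append(data)

    def text(self):
        return "".join(self.chunks).strip()


def visible_text(html):
    """Return only the text that would be visible to the reader."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A depth counter is used instead of a single flag so that a style element nested inside head does not switch visibility back on too early.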

Bill





Re: character sets in HTML files?

2001-10-17 Thread MJ Ray

Bill Janssen [EMAIL PROTECTED] writes:

 I've been reading the HTTP and HTML specs about character sets.

Shouldn't you be using the xhtml specs now?
-- 
MJR



Re: character sets in HTML files?

2001-10-17 Thread Bill Janssen

  I've been reading the HTTP and HTML specs about character sets.
 
 Shouldn't you be using the xhtml specs now?
 -- 
 MJR

As soon as we add an XML component to the parser...  It's on my list.

Actually, if you read the XHTML specs, you'll see that they refer you
back to the HTML specs for many, even most, things.

Bill




Re: making plucker-build easier to use?

2001-10-17 Thread Dirk Heiser

Bill == Bill Janssen [EMAIL PROTECTED] writes:

 Is there a way to tell Python to output this in binary mode?

Bill If you want to try it, change line 494 in parser/python/PyPlucker/Writer.py
Bill from
Bill self._pdb_file = prc.File (sys.stdout, read=0, write=1)
Bill to
Bill self._pdb_file = prc.File (os.fdopen(sys.stdout.fileno(), 'wb'), read=0, write=1)

Bill Seems to work for me.

Sorry, it does not work :-(

BTW: Calling:

 plucker-build file://C:\test\textfile.txt > out

results in the DB name file://C:\test\textfile.txt, and if it is too
long it will be shortened from the _beginning_.

Calling:

 plucker-build -H file://C:\test\textfile.txt -f out

The DB is named out (shouldn't it be named textfile?)

(By database name I mean the name shown on the Palm.)

cu,
 Dirk

-- 
Permanent URLs to the latest Version (1.1.13) of the Plucker Windows installer
 - For the Webpage: http://www.dirk-heiser.de/plucker
 - Direct Download: http://www.dirk-heiser.de/plucker/plucker.exe [2.08MB]



Re: added charset info to plucker doc; metadata record type

2001-10-17 Thread Dirk Heiser

Bill == Bill Janssen [EMAIL PROTECTED] writes:

 If i call locale.py i get this output:
   
   Language:  de_DE
   Encoding:  cp1252
   

Bill I've just looked at the base Python code in 1.5.2 for locale.py so

FYI: this is the output from Python 2.0

 i get:
 
 ---
 Converted plucker:/home.html
 Default charset is 1252
 1252
 0
 Converted plucker:/~special~/index
 --
 
 and 04E4 (1252) is written in the Plucker DB's metadata record. But
 http://www.iana.org/assignments/character-sets says:

Bill I've tried to duplicate this, but just get an error saying that 'cp1252'
Bill doesn't name a charset.

I guess it's because you treat every all-numeric charset as a MIBenum.

BTW: why do you expect the return value of getlocale to be a MIBenum? And
what if the user specifies 427: is it the charset name or the
MIBenum? Do we need a way to specify the MIBenum, or is it enough to
allow specifying the charset name?

 And finally, one idea: what about writing a parser that parses
 http://www.iana.org/assignments/character-sets and creates a complete
 NamedCharsets array? It seems to be easy and I think I could do this.

Bill Great idea!

I have checked this in. Please take a look and see if it is OK.
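
For readers following along, here is a rough sketch of what such a table generator could look like. It is not the code that was checked in. It assumes the registry's plain-text layout of "Name:", "MIBenum:" and "Alias:" lines; the SAMPLE text is an abbreviated excerpt typed in for illustration, and parse_charsets and resolve_charset are hypothetical names.

```python
# Abbreviated, illustrative excerpt in the registry's "Name:/MIBenum:/Alias:" layout.
SAMPLE = """\
Name: ANSI_X3.4-1968                                   [RFC1345,KXS2]
MIBenum: 3
Source: ECMA registry
Alias: iso-ir-6
Alias: US-ASCII

Name: ISO_8859-1:1987                                  [RFC1345,KXS2]
MIBenum: 4
Source: ECMA registry
Alias: ISO-8859-1
Alias: latin1

Name: windows-1252
MIBenum: 2252
Source: Microsoft
Alias: None
"""


def parse_charsets(text):
    """Map every lower-cased charset name and alias to its MIBenum."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        fields = value.split()
        if not sep or not fields:
            continue
        key = key.strip()
        first = fields[0]       # drops trailing references such as [RFC1345,KXS2]
        if key == "Name":
            records.append({"names": [first], "mibenum": None})
        elif not records:
            continue            # header text before the first record
        elif key == "MIBenum":
            records[-1]["mibenum"] = int(first)
        elif key == "Alias" and first.lower() != "none":
            records[-1]["names"].append(first)

    table = {}
    for record in records:
        if record["mibenum"] is not None:
            for name in record["names"]:
                table[name.lower()] = record["mibenum"]
    return table


def resolve_charset(spec, table):
    """Treat an all-digit spec as a MIBenum, anything else as a charset name."""
    spec = spec.strip()
    if spec.isdigit():
        return int(spec)
    return table.get(spec.lower())      # None if the name is unknown
```

With this sample, resolve_charset("windows-1252", table) gives 2252, "cp1252" is not found, and a bare "1252" is taken as MIBenum 1252, which is exactly the name-versus-number ambiguity raised above.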

cu,
 Dirk




more charset work checked in

2001-10-17 Thread Bill Janssen

I've updated the CVS with my more-or-less final charset work.  Please
check it out and try it.

Note that this does not involve any viewer changes, just parser stuff.

Bill




Re: making plucker-build easier to use?

2001-10-17 Thread Bill Janssen

 BTW: Calling:
 
  plucker-build file://C:\test\textfile.txt > out
 
 results in the DB name file://C:\test\textfile.txt, and if it is too
 long it will be shortened from the _beginning_.

Yes.  What would you suggest?  I thought that shortening it from the
front made more sense than cutting off the end.  I don't see another
reasonable way to construct a name for the DB, either, but am open to
suggestions.  Of course, --doc-name can still be used to specify
a document name:

  plucker-build --doc-name=textfile file://C:\test\textfile.txt > out

 Calling:
 
  plucker-build -H file://C:\test\textfile.txt -f out
 
 The DB is named out (shouldn't it be named textfile?)

Well, the way it's always worked is that the DB filename is used,
unless a doc-name is explicitly specified.  So I'd say it's still
working correctly.

Bill



Re: added charset info to plucker doc; metadata record type

2001-10-17 Thread Bill Janssen

Looks great, Dirk.  I'll switch over the code in TextParser.py to use it.

Bill



Re: making plucker-build easier to use?

2001-10-17 Thread Dirk Heiser

Bill == Bill Janssen [EMAIL PROTECTED] writes:

 BTW: Calling:
 
  plucker-build file://C:\test\textfile.txt > out
 
 results in the DB name file://C:\test\textfile.txt, and if it is too
 long it will be shortened from the _beginning_.

Bill Yes.  What would you suggest?  I thought that shortening it from the
Bill front made more sense than cutting off the end.  I don't see another

What about using the filename only (cutting off the drive, path and
extension), and if it is still too long, cutting from the end?

At the least, both ways of calling the parser (-f filename or
> filename) should end up with the _same_ DB name.
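
A small sketch of the two naming rules under discussion, for comparison. It is not taken from the Plucker sources: db_name_from_location and shorten_from_front are hypothetical helpers, and the 31-character limit assumes the usual Palm database name field (31 characters plus a terminating NUL).

```python
import ntpath

MAX_DB_NAME = 31  # assumed Palm limit: 31 characters plus a terminating NUL


def db_name_from_location(location, limit=MAX_DB_NAME):
    """Proposed rule: bare filename without extension, cut from the end."""
    # Drop a leading scheme such as "file://" if there is one.
    _scheme, sep, rest = location.partition("://")
    path = rest if sep else location
    # ntpath splits on both "\\" and "/" and strips a drive letter,
    # so it copes with Windows paths as well as URL-style paths.
    base = ntpath.basename(path)
    stem, _ext = ntpath.splitext(base)
    return (stem or base)[:limit]


def shorten_from_front(name, limit=MAX_DB_NAME):
    """Rule described in the thread: keep the tail of an over-long name."""
    return name[-limit:]
```

For the example above, db_name_from_location(r"file://C:\test\textfile.txt") yields "textfile", which is what the -f out variant would need to produce as well if both call styles are to agree.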

BTW: since writing to stdout does not work on Windows (and I guess
also not on OS/2 or Mac), what about wrapping this stdout code in a
sys.platform == 'linux' check, so the parser does not create broken
DBs on non-Linux systems?

cu,
 Dirk




Re: more charset work checked in

2001-10-17 Thread Bill Janssen

OK, I've now updated the code to use Dirk's generated table of charset
names.

Bill




Re: making plucker-build easier to use?

2001-10-17 Thread Bill Janssen

 BTW: since writing to stdout does not work on Windows (and I guess
 also not on OS/2 or Mac), what about wrapping this stdout code in a
 sys.platform == 'linux' check, so the parser does not create broken
 DBs on non-Linux systems?

I'm going to pursue this a bit further on Windows, first.  There must
be some way to write binary output.

 What about using the filename only (cutting off the drive, path and
 extension), and if it is still too long, cutting from the end?

I suppose we could do that.  I'd just as soon save as much of the
earlier info as possible.

Bill




Re: making plucker-build easier to use?

2001-10-17 Thread Bill Janssen

 I'm going to pursue this a bit further on Windows, first.  There must
 be some way to write binary output.

And indeed, the Python Cookbook
(http://aspn.activestate.com/ASPN/Cookbook/Python) has the answer.
Here it is:

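 import sys

 # On Windows, stdout is opened in text mode, which rewrites "\n" as "\r\n".
 if sys.platform == "win32":
     import os, msvcrt
     # Switch stdout to binary mode so the database bytes pass through unchanged.
     msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)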

I've incorporated that into the code.  Try it out!
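
To see why text mode breaks the database, here is a small illustration in present-day Python 3. It is not part of the Plucker code; simulate_windows_text_mode and binary_stdout are hypothetical helpers, and the first one only imitates the newline translation that a text-mode stream performs on Windows.

```python
import sys


def simulate_windows_text_mode(data: bytes) -> bytes:
    """Imitate a Windows text-mode stream: every LF byte becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


def binary_stdout():
    """Return a byte-oriented stream for stdout.

    In Python 3, sys.stdout is a text wrapper and the underlying binary
    stream is available as sys.stdout.buffer.
    """
    return sys.stdout.buffer


# Any 0x0A byte inside binary data is silently expanded, which shifts
# every offset that follows it.
record = b"\x00\x0a\x01"
damaged = simulate_windows_text_mode(record)
```

Here damaged is one byte longer than record, so any offsets stored in the database header no longer point at the right place.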

Bill