Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-04 Thread pedram salehpoor
On Tue, Jun 3, 2008 at 4:23 PM, Edward Z. Yang 
[EMAIL PROTECTED] wrote:

> Ben Dilts wrote:
> > I can't seem to find information on how to make complete XML files of
> > the reference docs.  The en/reference/*/functions/*.xml files are not
> > actually valid XML, as they don't have a DOCTYPE and don't define all
> > the custom XML entities they use.
> >
> > How do I transform these sources into complete, valid XML documents?
>
> Use configure.php. An outstanding project is to make all of PHP's
> documentation sources standalone valid XML files, but for now, you'll
> need to glom them all together.

Making all of them standalone valid XML files seems like a really
interesting project. Where should we start, and what benefits will it
have for phpdoc?
Regards.
Pedram



Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-04 Thread Philip Olson
> > > I can't seem to find information on how to make complete XML files
> > > of the reference docs.  The en/reference/*/functions/*.xml files are
> > > not actually valid XML, as they don't have a DOCTYPE and don't define
> > > all the custom XML entities they use.
> > >
> > > How do I transform these sources into complete, valid XML documents?
> >
> > Use configure.php. An outstanding project is to make all of PHP's
> > documentation sources standalone valid XML files, but for now, you'll
> > need to glom them all together.
>
> Making all of them standalone valid XML files seems like a really
> interesting project. Where should we start, and what benefits will it
> have for phpdoc?


This topic was recently discussed but I'm not sure where we're at on it.

  A PHPDoc DTD: Making subfiles validate:
http://markmail.org/message/fhn4onp6akaw5ltc

Regards,
Philip



Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-04 Thread Hannes Magnusson
On Wed, Jun 4, 2008 at 9:44 AM, pedram salehpoor
[EMAIL PROTECTED] wrote:


> On Tue, Jun 3, 2008 at 4:23 PM, Edward Z. Yang
> [EMAIL PROTECTED] wrote:
> > Use configure.php. An outstanding project is to make all of PHP's
> > documentation sources standalone valid XML files, but for now, you'll
> > need to glom them all together.
>
> Making all of them standalone valid XML files seems like a really
> interesting project. Where should we start, and what benefits will it
> have for phpdoc?


We would need our own DTD (based on the DocBook 5 DTD, obviously) which
would do some magic, such as including the various entities we use.

The benefits include being able to parse each file standalone, meaning
it would truly be possible to update single files.
Along those lines, we would be able to distribute separate downloads
for each extension, for instance, along with lots of other nifty
features.

It would be great if we could switch over to XInclude along the way...

-Hannes
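
To illustrate the idea of a DTD that supplies the entities, here is a
minimal Python sketch (not anything phpdoc ships). It prepends a DOCTYPE
whose internal subset declares the entities a fragment uses, so that the
fragment parses on its own. The make_standalone helper, the sample
fragment, and the example.purpose entity are all hypothetical; a real
setup would take the declarations from the project's own entity files.

```python
import re
import xml.etree.ElementTree as ET


def make_standalone(fragment: str, entities: dict[str, str]) -> str:
    """Prepend a DOCTYPE whose internal subset declares the given entities.

    Assumption: entity values are plain text without '"', '%' or '&'.
    """
    # Drop a leading XML declaration; the DOCTYPE has to come after it,
    # and it is not needed when parsing from a str.
    body = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", fragment)
    # Use the fragment's root element name as the DOCTYPE name.
    match = re.match(r"<([A-Za-z_][\w.\-]*)", body)
    if match is None:
        raise ValueError("fragment does not start with an element")
    root_name = match.group(1)
    declarations = "".join(
        f'<!ENTITY {name} "{value}">' for name, value in entities.items()
    )
    return f"<!DOCTYPE {root_name} [{declarations}]>{body}"


# Hypothetical fragment that uses an entity it never declares.
fragment = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    "<refentry><refpurpose>&example.purpose;</refpurpose></refentry>"
)
entities = {"example.purpose": "Hypothetical entity text"}

doc = make_standalone(fragment, entities)
root = ET.fromstring(doc)  # parses because the entity is now declared
print(root.find("refpurpose").text)
```

Parsing the unwrapped fragment directly fails with an undefined-entity
error, which is the problem described at the start of this thread.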


Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-04 Thread Edward Z. Yang
Philip Olson wrote:
> This topic was recently discussed but I'm not sure where we're at on it.

As far as I can tell (being the one who wrote the proposal) we're
waiting for someone to do it, or for my time to free up so that I can
sit down and get started on it.

-- 
 Edward Z. Yang                GnuPG: 0x869C48DA
 HTML Purifier http://htmlpurifier.org Anti-XSS Filter
 [[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]


[PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread Ben Dilts
I maintain a PHP IDE, and scrape php.net's documentation periodically for 
information on built-in functions, classes, constants, etc. using regular 
expressions.  The problem is, the actual HTML syntax changes periodically.


Is there any way for me to access the source data that is used to produce 
those manual pages?  My results would be better, my development time would 
go down, and it would save php.net a crawl's worth of bandwidth weekly.


Thanks!


Ben Dilts 



Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread Philip Olson


On 3 Jun 2008, at 11:31, Ben Dilts wrote:

> I maintain a PHP IDE, and scrape php.net's documentation periodically
> for information on built-in functions, classes, constants, etc. using
> regular expressions.  The problem is, the actual HTML syntax changes
> periodically.
>
> Is there any way for me to access the source data that is used to
> produce those manual pages?  My results would be better, my development
> time would go down, and it would save php.net a crawl's worth of
> bandwidth weekly.


Hello Ben,

This comes up from time to time, and although I don't remember the
specifics of what we discussed, here are a few words:

Current situation:
- We have various generated .xml and .txt files in CVS
- But we no longer generate them, nor do we trust how they were generated
- They were generated from PHP internal sources and not from the manual
- They don't really have a home, except through CVS
- Unfortunately people tend to scrape the manual instead, both over HTTP
  and from the downloadable HTML

Likely future situation:
- We'll use PhD to generate a friendly format for this
- The generated files will be hosted/offered outside of CVS
- We need to discuss this format
- We could also add a list of keywords like constants, predefined
  variables, etc.

Other considerations:
- PECL: Most PECL extensions are lightly used, so they may be seen as
  unnecessary information
- Not everything is documented, so generating from the manual misses
  the undocumented parts
- Whether or not to also scrape php-src (or only the PHP manual sources)

But the PHP Manual XML sources are all available in CVS, so feel free
to check them out and parse them. Read:


  http://php.net/about.generate

And to download via CVS, run this command:

  cvs -d :pserver:[EMAIL PROTECTED]:/repository co phpdoc

Regards,
Philip
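
Once a checkout exists, a first pass might simply enumerate the function
reference files. This is a minimal Python sketch, assuming the
en/reference/*/functions/*.xml layout mentioned elsewhere in this
thread; the function_sources helper and the "phpdoc" checkout path are
hypothetical.

```python
from pathlib import Path


def function_sources(checkout: Path) -> dict[str, list[Path]]:
    """Group function reference files by their extension directory."""
    grouped: dict[str, list[Path]] = {}
    # Layout assumed from this thread: en/reference/<extension>/functions/<file>.xml
    for path in sorted(checkout.glob("en/reference/*/functions/*.xml")):
        extension = path.parts[-3]  # the directory above "functions"
        grouped.setdefault(extension, []).append(path)
    return grouped


if __name__ == "__main__":
    # "phpdoc" is the module name from the cvs command; adjust to your checkout.
    for extension, files in function_sources(Path("phpdoc")).items():
        print(f"{extension}: {len(files)} function files")
```

Note that these files are fragments, so listing them works but parsing
each one on its own needs the entity handling discussed in this thread.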



Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread Philip Olson


On 3 Jun 2008, at 15:00, Ben Dilts wrote:


> CVS access to the XML is just what I needed.
>
> Even though the XML may not be really trusted anymore, it's certain
> to be more stable in its general format than the HTML, so I'll work
> on something to scrape over that.


Just to clarify, our XML sources are very trustworthy :) We used to
generate a few files from php-src (the PHP sources, i.e. the C files),
and some people used those in the past, but I don't recommend it. For
example, these:


  http://cvs.php.net/viewvc.cgi/phpdoc/funcindex.xml
  http://cvs.php.net/viewvc.cgi/phpdoc/funcsummary.txt

Regards,
Philip


Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread Ben Dilts

CVS access to the XML is just what I needed.

Even though the XML may not be really trusted anymore, it's certain to be 
more stable in its general format than the HTML, so I'll work on something 
to scrape over that.



Ben Dilts


Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread Ben Dilts
I can't seem to find information on how to make complete XML files of the 
reference docs.  The en/reference/*/functions/*.xml files are not actually 
valid XML, as they don't have a DOCTYPE and don't define all the custom XML 
entities they use.


How do I transform these sources into complete, valid XML documents?


Ben Dilts


Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread Edward Z. Yang
Ben Dilts wrote:
> I can't seem to find information on how to make complete XML files of
> the reference docs.  The en/reference/*/functions/*.xml files are not
> actually valid XML, as they don't have a DOCTYPE and don't define all
> the custom XML entities they use.
>
> How do I transform these sources into complete, valid XML documents?

Use configure.php. An outstanding project is to make all of PHP's
documentation sources standalone valid XML files, but for now, you'll
need to glom them all together.

-- 
 Edward Z. Yang                GnuPG: 0x869C48DA
 HTML Purifier http://htmlpurifier.org Anti-XSS Filter
 [[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]


Re: [PHP-DOC] Machine-readable PHP documentation?

2008-06-03 Thread G. T. Stresen-Reuter

On Jun 4, 2008, at 12:23 AM, Edward Z. Yang wrote:


> Ben Dilts wrote:
> > I can't seem to find information on how to make complete XML files of
> > the reference docs.  The en/reference/*/functions/*.xml files are not
> > actually valid XML, as they don't have a DOCTYPE and don't define all
> > the custom XML entities they use.
> >
> > How do I transform these sources into complete, valid XML documents?
>
> Use configure.php. An outstanding project is to make all of PHP's
> documentation sources standalone valid XML files, but for now, you'll
> need to glom them all together.

And then look for .manual.xml (note the dot at the start of the name -
it is a hidden file).


Ted Stresen-Reuter
http://tedmasterweb.com/php-bbedit-clipping-set/
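
Once configure.php has produced the combined .manual.xml, something like
the following Python sketch could stream through it without loading the
whole manual into memory. It assumes function pages carry their names in
DocBook refname elements, possibly inside an XML namespace; that
assumption, the iter_refnames and local_name helpers, and the file
location are illustrative and not confirmed anywhere in this thread.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def iter_refnames(manual: Path):
    """Yield the text of each <refname> element, streaming the file."""
    for _event, elem in ET.iterparse(manual, events=("end",)):
        if local_name(elem.tag) == "refname" and elem.text:
            yield elem.text.strip()
        # Release the element's content once its end tag has been seen.
        elem.clear()


if __name__ == "__main__":
    # The location of .manual.xml is an assumption; pass your own path.
    for name in iter_refnames(Path(".manual.xml")):
        print(name)
```

Because the file name starts with a dot, a plain directory listing will
not show it; opening it by its explicit path works as usual.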