Re: [VOTE] Accept PDFBox for incubation

2008-02-10 Thread Niclas Hedhman
On Friday 01 February 2008 22:18, Jukka Zitting wrote:
> Please vote on accepting the PDFBox project for incubation. 

[x] +1 Accept PDFBox as a new podling

Cheers
-- 
Niclas Hedhman, Software Developer

I  live here; http://tinyurl.com/2qq9er
I  work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [VOTE] Accept PDFBox for incubation

2008-02-07 Thread Jukka Zitting
Hi,

On Feb 7, 2008 3:27 PM, Emmanuel Lecharny <[EMAIL PROTECTED]> wrote:
> Jukka Zitting wrote:
> > +1 Emmanuel Lecharny (non-binding)
> >
> Maybe I missed a step... I was ack'd on Jan 20 as an Incubator
> PMC member. Any heads up?

Ah, sorry about that. I was looking at committee-info.txt.

BR,

Jukka Zitting




Re: [VOTE] Accept PDFBox for incubation

2008-02-07 Thread Emmanuel Lecharny

Jukka Zitting wrote:
> Hi,
>
> +1 Emmanuel Lecharny (non-binding)

Maybe I missed a step... I was ack'd on Jan 20 as an Incubator
PMC member. Any heads up?


Thanks !

--
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org






Re: [VOTE] Accept PDFBox for incubation

2008-02-07 Thread Jukka Zitting
Hi,

On Feb 6, 2008 6:35 PM, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
> BTW, Harmony has a native font library. It's probably worth
> coordinating these efforts.

Good point, thanks! Let's see what synergies we can come up with.

BR,

Jukka Zitting




Re: [VOTE] Accept PDFBox for incubation

2008-02-07 Thread Jukka Zitting
Hi,

On Feb 1, 2008 4:18 PM, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> Please vote on accepting the PDFBox project for incubation.
> [...]
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.

The vote passes with the following 8 binding and 8 non-binding +1 votes:

+1 Jukka Zitting
+1 Paul Fremantle
+1 Jeremias Maerki (non-binding)
+1 Matthias Wessendorf (non-binding)
+1 Thilo Goetz (non-binding)
+1 Niall Pemberton
+1 Bertrand Delacretaz
+1 Christian Geisert (non-binding)
+1 Martijn Dashorst
+1 Emmanuel Lecharny (non-binding)
+1 J Aaron Farr
+1 Craig L Russell
+1 Grant Ingersoll (non-binding)
+1 Robert Burrell Donkin
+1 Vincent Siveton (non-binding)
+1 Alexey Petrenko (non-binding)

Thanks for all the votes! I'll proceed to get the project
infrastructure set up. See you on [EMAIL PROTECTED]

PS. Jeremias, please ask to be included in the Incubator PMC now that
you're a mentor.

BR,

Jukka Zitting




Re: [VOTE] Accept PDFBox for incubation

2008-02-06 Thread Alexey Petrenko
+1.
BTW, Harmony has a native font library. It's probably worth
coordinating these efforts.

SY, Alexey

2008/2/1, Jukka Zitting <[EMAIL PROTECTED]>:
> Incubator PMC,
>
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> Here's my +1
>
> BR,
>
> Jukka Zitting
>
> 
>
> = PDFBox =
>
> === Abstract ===
>
> PDFBox is an open source Java PDF library for working with PDF documents.
>
> === Proposal ===
>
> The PDFBox library allows creation of new PDF documents, manipulation
> of existing documents and the ability to extract content from
> documents. PDFBox also includes several command line utilities. Future
> development plans include extending PDFBox with advanced data
> extraction and high level PDF creation functionality.
>
> In addition to PDFBox, this proposal also covers the !FontBox and
> !JempBox companion libraries. !FontBox is a Java font library used to
> obtain low level information from font files. !JempBox is a Java
> library that implements Adobe's XMP specification. All these
> components would be incubated as a single Apache PDFBox podling
> project.
>
> === Background ===
>
> The PDFBox project was started and originally written by Ben
> Litchfield in 2002 and currently lives on SourceForge. The initial
> purpose of PDFBox was to extract text content to be indexed by the
> Lucene search engine.  In addition to text extraction the library also
> supports a low level API for PDF creation and manipulation.  In the
> past, several developers have helped develop specific features in
> PDFBox but none have continued once their specific needs were met.
>
> In 2006 discussions began with the FOP team to collaborate on a single
> PDF library within the Apache organization.  New projects have
> expressed interest in advancing the functionality of PDFBox.
>
> Recently, Tika also expressed interest in advancing the content
> extraction capabilities of PDFBox.
>
> The !FontBox and !JempBox libraries have no dependencies to PDFBox,
> but their primary purpose is to support PDFBox and the development
> community is largely overlapping. It makes sense to include all three
> libraries in a single project.
>
> === Rationale ===
>
> The PDF document format is a common format found on the internet and
> across industries as a way of sharing documents.  Several Apache
> projects utilize PDF technologies but there is not a single
> independent PDF library within the Apache organization.
>
> The Apache XML Graphics project (FOP/Batik) has a write-only PDF
> library and is in need of PDF parsing functionality. Many features
> overlap those of PDFBox. This is currently a duplication of effort;
> bringing PDFBox into Apache and combining our efforts will result in a
> more robust PDF library that will be able to support many more use
> cases for working with PDF technologies.
>
> !FontBox, FOP and Batik all contain font loading/handling code that
> could likely be merged into a single common library either within the
> PDFBox podling or outside it.
>
> === Initial Goals ===
>
> The initial goals are:
>
>   * Advanced text extraction techniques
>   * Increase community involvement
>   * Cooperation with existing Apache projects such as XML Graphics
>   * Increasing support for PDF document features
>   * Adding a high level API for document creation
>   * Adding a streaming API for document creation
>   * PDF/A creation and validation functionality
>   * Review licensing of both bundled and external dependencies
>   * Manage export control notices for cryptographic features
>   * Figure out how to handle font handling code across !FontBox, FOP, and 
> Batik
>   * Replace !JempBox with Adobe's XMP library
>
> == Current Status ==
>
> === Meritocracy ===
>
> Not all initial committers are familiar with the meritocracy
> principles of Apache.  It is expected that the committers that are not
> will learn the meritocracy rules and they will be followed through the
> life of the project.
>
> === Community ===
>
> PDFBox has existed for several years on SourceForge and has an active
> community and continues to grow each day.  There are hundreds of
> existing projects that utilize the current version of PDFBox.
>
> === Core Developers ===
>
> Ben Litchfield is the main developer on this project although it is
> expected that developers from a variety of existing Apache projects
> will become part of the team.
>
> === Alignment ===
>
> The ability to search PDF documents is a basic requirement for any
> enterprise search solution.  PDFBox 

Re: [VOTE] Accept PDFBox for incubation

2008-02-05 Thread Vincent Siveton
Definitely +1

Vincent

2008/2/1, Jukka Zitting <[EMAIL PROTECTED]>:
> Incubator PMC,
>
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> Here's my +1
>
> BR,
>
> Jukka Zitting

Re: [VOTE] Accept PDFBox for incubation

2008-02-05 Thread Robert Burrell Donkin
On Feb 1, 2008 2:18 PM, Jukka Zitting <[EMAIL PROTECTED]> wrote:



> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [X] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)

- robert




Re: [VOTE] Accept PDFBox for incubation

2008-02-04 Thread Grant Ingersoll

+1

On Feb 1, 2008, at 9:18 AM, Jukka Zitting wrote:


Incubator PMC,

Please vote on accepting the PDFBox project for incubation. The full
PDFBox proposal is available at the end of this message and as a wiki
page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
Maerki, and Niall Pemberton as the mentors.

The vote is open for the next 72 hours and only votes from the
Incubator PMC are binding.

   [ ] +1 Accept PDFBox as a new podling
   [ ] -1 Do not accept the new podling (provide reason, please)

Here's my +1

BR,

Jukka Zitting



= PDFBox =

=== Abstract ===

PDFBox is an open source Java PDF library for working with PDF documents.


=== Proposal ===

The PDFBox library allows creation of new PDF documents, manipulation
of existing documents and the ability to extract content from
documents. PDFBox also includes several command line utilities. Future
development plans include extending PDFBox with advanced data
extraction and high level PDF creation functionality.

In addition to PDFBox, this proposal also covers the !FontBox and
!JempBox companion libraries. !FontBox is a Java font library used to
obtain low level information from font files. !JempBox is a Java
library that implements Adobe's XMP specification. All these
components would be incubated as a single Apache PDFBox podling
project.

=== Background ===

The PDFBox project was started and originally written by Ben
Litchfield in 2002 and currently lives on SourceForge. The initial
purpose of PDFBox was to extract text content to be indexed by the
Lucene search engine.  In addition to text extraction the library also
supports a low level API for PDF creation and manipulation.  In the
past, several developers have helped develop specific features in
PDFBox but none have continued once their specific needs were met.

In 2006 discussions began with the FOP team to collaborate on a single
PDF library within the Apache organization.  New projects have
expressed interest in advancing the functionality of PDFBox.

Recently, Tika also expressed interest in advancing the content
extraction capabilities of PDFBox.

The !FontBox and !JempBox libraries have no dependencies to PDFBox,
but their primary purpose is to support PDFBox and the development
community is largely overlapping. It makes sense to include all three
libraries in a single project.

=== Rationale ===

The PDF document format is a common format found on the internet and
across industries as a way of sharing documents.  Several Apache
projects utilize PDF technologies but there is not a single
independent PDF library within the Apache organization.

The Apache XML Graphics project (FOP/Batik) has a write-only PDF
library and is in need of PDF parsing functionality. Many features
overlap those of PDFBox. This is currently a duplication of effort;
bringing PDFBox into Apache and combining our efforts will result in a
more robust PDF library that will be able to support many more use
cases for working with PDF technologies.

!FontBox, FOP and Batik all contain font loading/handling code that
could likely be merged into a single common library either within the
PDFBox podling or outside it.

=== Initial Goals ===

The initial goals are:

 * Advanced text extraction techniques
 * Increase community involvement
 * Cooperation with existing Apache projects such as XML Graphics
 * Increasing support for PDF document features
 * Adding a high level API for document creation
 * Adding a streaming API for document creation
 * PDF/A creation and validation functionality
 * Review licensing of both bundled and external dependencies
 * Manage export control notices for cryptographic features
 * Figure out how to handle font handling code across !FontBox, FOP,
   and Batik
 * Replace !JempBox with Adobe's XMP library

== Current Status ==

=== Meritocracy ===

Not all initial committers are familiar with the meritocracy
principles of Apache.  It is expected that the committers that are not
will learn the meritocracy rules and they will be followed through the
life of the project.

=== Community ===

PDFBox has existed for several years on SourceForge and has an active
community and continues to grow each day.  There are hundreds of
existing projects that utilize the current version of PDFBox.

=== Core Developers ===

Ben Litchfield is the main developer on this project although it is
expected that developers from a variety of existing Apache projects
will become part of the team.

=== Alignment ===

The ability to search PDF documents is a basic requirement for any
enterprise search solution.  PDFBox provides the basic content that is
needed for content indexing.  This functionality aligns with that
of Lucene, Nutch, Tika and UIMA and all users of these projects will
benefit from continued development of PDFBox.

PDFBox shares similar font loading and handling needs as FOP and
Batik, and the code in the !F

Re: [VOTE] Accept PDFBox for incubation

2008-02-04 Thread Craig L Russell

+1

Craig

On Feb 1, 2008, at 6:18 AM, Jukka Zitting wrote:


Please vote on accepting the PDFBox project for incubation. The full
PDFBox proposal is available at the end of this message and as a wiki
page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
Maerki, and Niall Pemberton as the mentors.

The vote is open for the next 72 hours and only votes from the
Incubator PMC are binding.

  [ ] +1 Accept PDFBox as a new podling
  [ ] -1 Do not accept the new podling (provide reason, please)


Craig Russell
Architect, Sun Java Enterprise System http://java.sun.com/products/jdo
408 276-5638 mailto:[EMAIL PROTECTED]
P.S. A good JDO? O, Gasp!





Re: [VOTE] Accept PDFBox for incubation

2008-02-03 Thread J Aaron Farr
"Jukka Zitting" <[EMAIL PROTECTED]> writes:

> Incubator PMC,
>
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)

  [X] +1 Accept PDFBox as a new podling



-- 
  J Aaron Farr jadetower.com[US] +1 724-964-4515 
馮傑仁  cubiclemuses.com [HK] +852 8123-7905  




Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Emmanuel Lecharny


[X] +1 Accept PDFBox as a new  podling

A must have !

--
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org






Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Martijn Dashorst
+1

On 2/1/08, Jukka Zitting <[EMAIL PROTECTED]> wrote:
>
> Incubator PMC,
>
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> Here's my +1
>
> BR,
>
> Jukka Zitting

Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Christian Geisert

Jukka Zitting wrote:

Incubator PMC,

Please vote on accepting the PDFBox project for incubation. The full
PDFBox proposal is available at the end of this message and as a wiki


+1

Christian




Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Bertrand Delacretaz
On Feb 1, 2008 3:18 PM, Jukka Zitting <[EMAIL PROTECTED]> wrote:

> Please vote on accepting the PDFBox project for incubation

+1, I'm glad to see this coming here!

-Bertrand




Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Niall Pemberton
+1

Niall

On Feb 1, 2008 2:18 PM, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> Incubator PMC,
>
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> Here's my +1
>
> BR,
>
> Jukka Zitting

Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Thilo Goetz

Jukka Zitting wrote:

Incubator PMC,

Please vote on accepting the PDFBox project for incubation. The full
PDFBox proposal is available at the end of this message and as a wiki
page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
Maerki, and Niall Pemberton as the mentors.

The vote is open for the next 72 hours and only votes from the
Incubator PMC are binding.

[X] +1 Accept PDFBox as a new podling
[ ] -1 Do not accept the new podling (provide reason, please)


+1 (non-binding).  This is a great addition to the growing Apache
text/unstructured stack.

--Thilo




Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Matthias Wessendorf
+1

On Feb 1, 2008 3:48 PM, Jeremias Maerki <[EMAIL PROTECTED]> wrote:
> +1 from me, obviously.
>
> On 01.02.2008 15:18:51 Jukka Zitting wrote:
> > Incubator PMC,
> >
> > Please vote on accepting the PDFBox project for incubation. The full
> > PDFBox proposal is available at the end of this message and as a wiki
> > page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> > Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> > Maerki, and Niall Pemberton as the mentors.
> >
> > The vote is open for the next 72 hours and only votes from the
> > Incubator PMC are binding.
> >
> > [X] +1 Accept PDFBox as a new podling
> > [ ] -1 Do not accept the new podling (provide reason, please)
> >
> > Here's my +1
> >
> > BR,
> >
> > Jukka Zitting
> 
>
>
>
> Jeremias Maerki
>
>
>
>
>



-- 
Matthias Wessendorf

further stuff:
blog: http://matthiaswessendorf.wordpress.com/
sessions: http://www.slideshare.net/mwessendorf
mail: matzew-at-apache-dot-org




Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Jeremias Maerki
+1 from me, obviously.

On 01.02.2008 15:18:51 Jukka Zitting wrote:
> Incubator PMC,
> 
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
> 
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
> 
> [X] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
> 
> Here's my +1
> 
> BR,
> 
> Jukka Zitting




Jeremias Maerki





Re: [VOTE] Accept PDFBox for incubation

2008-02-01 Thread Paul Fremantle
Jukka

Seems like a clear proposal with all the right thinking behind it. +1.

Paul

On Feb 1, 2008 2:18 PM, Jukka Zitting <[EMAIL PROTECTED]> wrote:
> Incubator PMC,
>
> Please vote on accepting the PDFBox project for incubation. The full
> PDFBox proposal is available at the end of this message and as a wiki
> page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
> Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
> Maerki, and Niall Pemberton as the mentors.
>
> The vote is open for the next 72 hours and only votes from the
> Incubator PMC are binding.
>
> [ ] +1 Accept PDFBox as a new podling
> [ ] -1 Do not accept the new podling (provide reason, please)
>
> Here's my +1
>
> BR,
>
> Jukka Zitting

[VOTE] Accept PDFBox for incubation

2008-02-01 Thread Jukka Zitting
Incubator PMC,

Please vote on accepting the PDFBox project for incubation. The full
PDFBox proposal is available at the end of this message and as a wiki
page at http://wiki.apache.org/incubator/PDFBoxProposal. We ask the
Incubator PMC to sponsor the PDFBox podling, with myself, Jeremias
Maerki, and Niall Pemberton as the mentors.

The vote is open for the next 72 hours and only votes from the
Incubator PMC are binding.

[ ] +1 Accept PDFBox as a new podling
[ ] -1 Do not accept the new podling (provide reason, please)

Here's my +1

BR,

Jukka Zitting



= PDFBox =

=== Abstract ===

PDFBox is an open source Java PDF library for working with PDF documents.

=== Proposal ===

The PDFBox library allows creation of new PDF documents, manipulation
of existing documents and the ability to extract content from
documents. PDFBox also includes several command line utilities. Future
development plans include extending PDFBox with advanced data
extraction and high level PDF creation functionality.

In addition to PDFBox, this proposal also covers the !FontBox and
!JempBox companion libraries. !FontBox is a Java font library used to
obtain low level information from font files. !JempBox is a Java
library that implements Adobe's XMP specification. All these
components would be incubated as a single Apache PDFBox podling
project.

=== Background ===

The PDFBox project was started by Ben Litchfield in 2002 and currently
lives on SourceForge. The initial purpose of PDFBox was to extract text
content to be indexed by the Lucene search engine.  In addition to text
extraction, the library also supports a low level API for PDF creation
and manipulation.  In the past, several developers have helped develop
specific features in PDFBox, but none have continued once their specific
needs were met.

In 2006 discussions began with the FOP team to collaborate on a single
PDF library within the Apache organization.  New projects have
expressed interest in advancing the functionality of PDFBox.

Recently, Tika also expressed interest in advancing the content
extraction capabilities of PDFBox.

The !FontBox and !JempBox libraries have no dependencies on PDFBox,
but their primary purpose is to support PDFBox and their development
communities largely overlap. It makes sense to include all three
libraries in a single project.

=== Rationale ===

The PDF document format is a common format found on the internet and
across industries as a way of sharing documents.  Several Apache
projects utilize PDF technologies, but there is not a single
independent PDF library within the Apache organization.

The Apache XML Graphics project (FOP/Batik) has a write-only PDF
library and is in need of PDF parsing functionality. Many features
overlap those of PDFBox. This is currently a duplication of effort;
bringing PDFBox into Apache and combining our efforts will result in a
more robust PDF library that will be able to support many more use
cases for working with PDF technologies.

!FontBox, FOP and Batik all contain font loading/handling code that
could likely be merged into a single common library either within the
PDFBox podling or outside it.

=== Initial Goals ===

The initial goals are:

  * Advanced text extraction techniques
  * Increase community involvement
  * Cooperation with existing Apache projects such as XML Graphics
  * Increasing support for PDF document features
  * Adding a high level API for document creation
  * Adding a streaming API for document creation
  * PDF/A creation and validation functionality
  * Review licensing of both bundled and external dependencies
  * Manage export control notices for cryptographic features
  * Figure out how to handle font handling code across !FontBox, FOP, and Batik
  * Replace !JempBox with Adobe's XMP library

== Current Status ==

=== Meritocracy ===

Not all initial committers are familiar with the meritocracy
principles of Apache.  It is expected that those who are not will
learn the meritocracy rules and that these rules will be followed
throughout the life of the project.

=== Community ===

PDFBox has existed for several years on SourceForge and has an active
community that continues to grow each day.  There are hundreds of
existing projects that utilize the current version of PDFBox.

=== Core Developers ===

Ben Litchfield is the main developer on this project although it is
expected that developers from a variety of existing Apache projects
will become part of the team.

=== Alignment ===

The ability to search PDF documents is a basic requirement for any
enterprise search solution.  PDFBox provides the basic content that is
needed for content indexing.  This functionality aligns with that of
Lucene, Nutch, Tika and UIMA, and all users of these projects will
benefit from continued development of PDFBox.

PDFBox shares similar font loading and handling needs as FOP and
Batik, and the code in the !FontBox companion library could well be
merged wit