Hi Jeremias,
Sorry for the confusion. Ken did most of the patch and then had to
work on some other projects, so I did some final touches and
submitted it. We already have a CCLA on file.
thanks,
brian
On Mar 2, 2009, at 3:30 AM, Jeremias Maerki wrote:
Thanks for speaking up, Ken. It's a great thing you're contributing to
PDFBox. But we actually do have legal issues to worry about here.
The way this happened, we don't have a legal trail to make sure that
your contributions are actually intended for inclusion and under what
license. Only Brian (hopefully) knows your intentions. When you attach a
patch to a Jira issue, you have to tick a checkbox indicating that you
intend this for inclusion:
"[ ] Grant license to ASF for inclusion in ASF works (as per the
Apache License §5)
Contributions intended for inclusion in ASF products (eg. patches, code)
must be licensed to ASF under the terms of the Apache License. Other
attachments (eg. log dumps, test cases) need not be."
With §5 of the ALv2 you explicitly give the ASF the same license for
your changes as the ASF gives to its users. That is enough for smaller
patches (bugfixes, small improvements). As soon as you contribute
considerable new functionality or new files which have a certain
"artistic" aspect, §5 is considered insufficient, at which point
committers are expected to ask for a Contributor License Agreement to
be filed with the ASF. Also, regular contributors should send in a CLA
as it is also a precondition to becoming a committer. For even larger
contributions (like whole new subsystems), a contribution may even have
to go through IP clearance with an explicit separate license grant on
the code submitted. So there are various levels. The lines are probably
not always very clearly drawn. But the intent is to protect the users
and the contributors (i.e. you) from legal harm [1]. That can only
happen if we have a clean legal trail.
[1] http://apache.org/foundation/how-it-works.html#what
(see especially the third point in the list)
I only noticed after this started that you and Justin LeFebvre are from
the same company. Both of you have written more than one patch. So I
would like to suggest that both of you send in an ICLA [2]. Please also
check if the work contracts in your company make it necessary to send in
a CCLA [2] in addition to the ICLAs.
[2] http://apache.org/licenses/#clas
A committer can always ask the PMC chair or an ASF member to check if a
particular ICLA has been recorded yet.
Ken, can I ask you to attach the two (original) patches that were
processed via Brian to the JIRA issues associated with them, so the gaps
are filled, even if that happens after the two patches were processed? I
think that should be enough to correct the situation. In the future,
please attach your patches to a new JIRA issue and take it from there.
There are other points also: by directly working with Brian, there is no
discussion (if necessary) around this if anyone has any issues. Other
committers can only react after everything has already happened. You're
also not taking part in the community, whose building is the most
important task of PDFBox being in the Apache Incubator. And you're not
getting the same visibility you'd get if you took part in discussions
here. Only that way does the existing team have a chance to get to know
you and to eventually vote you in as a committer if you turn out to be a
regular contributor. That two employees of your company contribute
to PDFBox means that it is important to you. Then it is all the more
important that you participate in the project and jointly help evolve
the project in directions that help you.
Everybody (especially Brian), don't feel bad about this! The Incubation
phase is here for everybody to learn how we do things inside the Apache
Software Foundation. There are a few rules that make the ASF so
different from the ordinary SourceForge project. I know it's a lot of
new stuff that especially new committers have to learn. Hopefully, we
mentors can help clear things up if there are questions or problems.
Thank you for your understanding!
On 01.03.2009 19:31:14 Ken Glidden wrote:
I am said Ken Glidden.
I'm VP of Engineering at Basis Technology and am working directly
with Brian on this.
No legal issues to worry about.
Cheers.
-----Original Message-----
From: Jeremias Maerki [mailto:[email protected]]
Sent: Saturday, February 28, 2009 12:26 PM
To: [email protected]
Subject: Re: [jira] Resolved: (PDFBOX-430) Incorrect diacritic placement in text extraction
Brian,
you state here that you've applied a patch by one Ken Glidden. I cannot
find any post or submission from a person with that name on the PDFBox
mailing lists. So I'm concerned about the legal trail here. Can you
explain that, please? Thank you.
On 18.02.2009 22:36:01 Brian Carrier (JIRA) wrote:
[ https://issues.apache.org/jira/browse/PDFBOX-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brian Carrier resolved PDFBOX-430.
----------------------------------
Resolution: Fixed
Fixed with patch by Ken Glidden that merges a single diacritic
text chunk into the previous text chunk if they overlap. Note
that this will not solve problems where the diacritic comes much
after the text chunk it overlays, but we have not observed PDF
files like that.
Sending trunk/src/main/java/org/apache/pdfbox/util/PDFTextStripper.java
Sending trunk/src/main/java/org/apache/pdfbox/util/TextPosition.java
Sending trunk/test/input/Acrobat9.pdf-sorted.txt
Sending trunk/test/input/Acrobat9.pdf.txt
Transmitting file data ....
Committed revision 745665.
Incorrect diacritic placement in text extraction
------------------------------------------------
Key: PDFBOX-430
URL: https://issues.apache.org/jira/browse/PDFBOX-430
Project: PDFBox
Issue Type: Bug
Reporter: Brian Carrier
Some PDF files store diacritics (accents over characters) as
separate text elements. The PDF files essentially have a chunk
of text and then back up and place the diacritic over one of the
characters in the chunk of text. With text extraction, the
current design does not allow the diacritic to be placed over a
character in the chunk and instead it is placed after the chunk.
The debug-diac2.pdf file in PDFBOX-429 shows this problem.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Jeremias Maerki