I think the idea needs to be considerably more exciting to attract students - 
nobody wants to fix the bugs that even we don’t want to fix!

There are some interesting users of PDFBox, see 
http://pdfliberation.wordpress.com/ for some possible ideas… lots of people 
using OCR there too.

> PDFBOX-1594 Add support for AES256 Encryption 

Seems like a reasonable project.

-- John

On 29 Jan 2014, at 17:28, Fred Hansen <zweibie...@yahoo.com> wrote:

> 
> IMHO a task for GSoC should be non-critical, localized, and not a user 
> interface. A "non-critical" task is one where PDFBOX development can continue 
> without relying on the project result. A "localized" project is one that can 
> be incorporated into the code base with few changes to the base. This will 
> limit the effort required to learn about the system into which the effort 
> will fit. A "user-interface" project implements an interactive window or an API. I 
> have low expectations of the capabilities of students for doing good designs 
> in these areas.
> 
> So I looked through JIRA for open issues meeting the above criteria. Since I am not 
> all that familiar with PDFBOX, some of my suggestions may be laughable and 
> surely I have missed some. Nonetheless, here's what I found:
> 
> 
> PDFBOX-553 writing pdf file in Japanese, garbled 
> PDFBOX-570 Windings font recognition + spacing issue 
> PDFBOX-605 Better support for Type0 fonts 
> PDFBOX-678 Support missing Text Rendering Modes when rendering a PDF 
> PDFBOX-870 PDF-To-IMAGE output is not anti-aliased 
> PDFBOX-1094 Pattern colorspace support 
> PDFBOX-1594 Add support for AES256 Encryption 
>         (see also PDFBOX-1450 document how to encrypt with AES 256 )
> PDFBOX-1734 ImageIoUtil.WriteImage doesn't work with tiff images
> PDFBOX-1843 Find a way to test PDFToImage 
> 
> 
> 
> 
>> ________________________________
>> From: John Hewson <j...@jahewson.com>
>> To: "dev@pdfbox.apache.org" <dev@pdfbox.apache.org> 
>> Sent: Wednesday, January 29, 2014 6:38 PM
>> Subject: Re: [DISCUSS] GSoC Participation
>> 
>> 
>>> - an idea which came up some years ago was to implement a GUI to bundle
>>> some/all/future tools/features of pdfbox, like printing, rendering,
>>> preflight, split, merge etc.
>> 
>> The AWT/Swing PDF viewer could do with rewriting. But does anyone want that? 
>> Maybe support for JavaFX?
>> 
>>> - a high-level api to create pdfs
>> 
>> I've been thinking about this recently and have come to the conclusion that 
>> it's really hard to do well.
>> 
>>> - an advanced text extractor with table/column support
>> 
>> The table stuff sounds a lot like Tabula? Do we really not have column 
>> support? We need that!
>> 
>> I'll throw in some ideas too:
>> 
>> - an interface for OCR engines to plug into the text extraction API. It 
>> could provide access to extracted images or allow badly encoded fonts to be 
>> passed to OCR one character or text run at a time.
>> 
>> -- John
>> 
>> 
>>> On 29 Jan 2014, at 03:20, Andreas Lehmkühler <andr...@lehmi.de> wrote:
>>> 
>>> Hi,
>>> 
>>>> Maruan Sahyoun <sahy...@fileaffairs.de> wrote on 29 January 2014 at 10:44:
>>>> 
>>>> 
>>>> Hi
>>>> 
>>>> Shall we try to participate in GSoC? It needs a mentor, though.
>>> That idea has come up from time to time, and it didn't work out for various
>>> reasons.
>>> 
>>> So, to participate we need a mentor and of course at least one good idea to
>>> be proposed.
>>> 
>>> I won't act as mentor for various reasons, but I'll try to help in the
>>> normal manner.
>>> 
>>> IMO an appropriate idea should not deal with pdf-specific low-level features,
>>> like linearization support, as I doubt that any prospective student is familiar
>>> with the pdf-spec.
>>> 
>>> So possible ideas could be:
>>> 
>>> - an idea which came up some years ago was to implement a GUI to bundle
>>> some/all/future tools/features of pdfbox, like printing, rendering,
>>> preflight, split, merge etc.
>>> - a high-level api to create pdfs
>>> - an advanced text extractor with table/column support
>>> 
>>> 
>>>> BR
>>>> 
>>>> Maruan Sahyoun
>>> 
>>> BR
>>> Andreas Lehmkühler
>> 
