Re: Google Summer of Code 2009

Alexei Fedotov Sun, 01 Mar 2009 06:24:18 -0800

Mark,
I like the task about jars.

I have a hint for a student who wants to approach it. Harmony's jar
reading code has numerous limitations and assumptions (e.g. Harmony
limits the size of a jar file). It is important to keep most of these
limitations as they are, resisting the desire to eliminate them all at
once. Otherwise, instead of a performance gain, one may find that
popular applications slow down.


Thanks.


On Sun, Mar 1, 2009 at 5:05 PM, Mark Hindess
<[email protected]> wrote:
>
> In message <[email protected]>, Mark Hindess writes:
>>
>>
>> In message <[email protected]>,
>> Sian January writes:
>> >
>> > Hi everyone,
>> >
>> > Do we want to propose any projects for Google Summer of Code 2009?  It
>> > was quite successful last year for Harmony, with two students
>> > completing the programme, so definitely worth doing in my opinion.
>> >
>> > http://code.google.com/soc/
>> >
>> > Thanks,
>> > Sian
>>
>> I've a couple of items on my todo list that might make an interesting
>> GSoC project.  While looking at file descriptor usage between Harmony
>> and RI I noticed that the RI typically reads jar files with an
>> open/mmap/close sequence and then uses the mapped memory to access the
>> file.  Harmony uses open and uses seek/read to access the file.  There
>> are a couple of issues here:
>>
>>   * some applications that use lots of jar files will not work on Harmony
>>     because they will run out of file descriptors even though they will
>>     work on the RI
>
> I noticed while looking at the strace from the latest "trivial" test case
> in the "Problems with NIO" thread that on the RI the client connect
> socket is always fd=4 whereas on DRLVM it is fd=110 so the difference
> is quite significant.  This got me wondering what the difference would
> be when running something like Eclipse with lots of plugin jars.  Just
> loading a fairly trivial workspace on Sun and DRLVM results in using
> 586 and 674 file descriptors respectively.  So it looks like not all
> jars are loaded using the mmap trick but DRLVM would still run out of
> descriptors roughly 100 sooner than the RI.
>
> -Mark
>
>>   * code with memory access rather than seek/read will be a lot simpler
>>     to read/maintain
>>
>>   * what are the performance implications?
>>
>> I'd quite like to investigate this but don't seem to be finding the time.
>>
>> It might also be interesting to explore the possibility of exploiting
>> parallelism (compare gzip/pigz).
>>
>> It might also be worth seeing if there is any performance benefit to using
>> the inflateBack api (compare gzip/gun - gun is in the zlib source examples
>> directory).
>>
>> If people think these ideas are concrete enough to explore then I'll add
>> an item to the wiki.
>>
>> Regards,
>>  Mark.
>>
>
>
>
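
For readers who want to see the two access styles from the quoted message
side by side, here is a minimal Java sketch. It is only an illustration
under stated assumptions, not Harmony's or the RI's actual code: the class
and method names (JarAccessSketch, readSigWithSeek, mapAndClose) are
hypothetical, and it reads just the 4-byte local file header signature
rather than parsing a whole jar. It relies on the documented behaviour of
FileChannel.map that a mapping stays valid after the channel is closed,
which is what lets the open/mmap/close sequence give the descriptor back.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Harmony or the RI.
public class JarAccessSketch {
    // The bytes 'P', 'K', 3, 4 (local file header signature) read as a
    // little-endian 32-bit integer.
    static final int LOCAL_HEADER_SIG = 0x04034b50;

    // seek/read style: every access goes through an open descriptor.
    // Here it is closed right away; code that keeps the file object
    // around for later reads keeps the descriptor open too.
    static int readSigWithSeek(Path jar, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(jar.toFile(), "r")) {
            raf.seek(offset);
            byte[] b = new byte[4];
            raf.readFully(b);
            return ByteBuffer.wrap(b).order(ByteOrder.LITTLE_ENDIAN).getInt();
        }
    }

    // open/mmap/close style: the channel is closed before returning,
    // but the returned mapping remains usable for plain memory access.
    // Note that a single mapping cannot exceed Integer.MAX_VALUE bytes,
    // so this style carries a file size limitation of its own.
    static MappedByteBuffer mapAndClose(Path jar) throws IOException {
        try (FileChannel ch = FileChannel.open(jar, StandardOpenOption.READ)) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.order(ByteOrder.LITTLE_ENDIAN);
            return buf;
        }
    }

    // Absolute read from the mapping; no seek and no descriptor needed.
    static int readSigFromMapping(ByteBuffer mapped, int offset) {
        return mapped.getInt(offset);
    }

    public static void main(String[] args) throws IOException {
        // A tiny stand-in file that only carries the signature bytes.
        Path tmp = Files.createTempFile("sketch", ".jar");
        Files.write(tmp, new byte[] {'P', 'K', 3, 4, 0, 0, 0, 0});

        int viaSeek = readSigWithSeek(tmp, 0);
        int viaMap = readSigFromMapping(mapAndClose(tmp), 0);
        System.out.println("seek/read matches signature: "
            + (viaSeek == LOCAL_HEADER_SIG));
        System.out.println("mapping matches signature:   "
            + (viaMap == LOCAL_HEADER_SIG));
    }
}
```

Whether the mapped variant is actually faster is exactly the open
question in the proposal; the sketch only shows why it needs fewer
long-lived descriptors.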



-- 
With best regards,
Alexei Fedotov,
http://people.apache.org/~aaf/
