Re: Apply for mentor role in ASF for GSC

2015-03-10 Thread Ulrich Stärk
Please refer to my emails detailing the process which I sent to all PMCs.

Thanks,

Uli

On 2015-03-09 19:35, Dingcheng Li wrote:
 Dear ASF administrator, I am a research scientist and a senior software
 developer at Mayo Clinic medical informatics group. I am also a PMC member
 for cTAKES which is an open-source software developed by our group and
 released via ASF. I am quite interested in GSC and would like to serve as a
 mentor. I hope that more good software can be developed by young students
 and contributed to the ASF repository. I hope that I can be accepted by you as
 a mentor. Thanks, Dingcheng
 


Disk space requirement for building on Windows (was: Re: [jira] [Commented] (JENA-897) jena-jdbc-tdb tests use %TEMP% instead of target/)

2015-03-10 Thread Stian Soiland-Reyes
 Thanks, Rob!

I tried looking yesterday at ways to reduce the disk space
requirements when building on Windows - including truncating the files
after closing. This seems to require deep changes to TDB's
ChannelManager, which keeps the corresponding FileChannels - perhaps a
new method for that purpose?

https://github.com/apache/jena/blob/master/jena-tdb/src/main/java/com/hp/hpl/jena/tdb/base/file/ChannelManager.java


It seems on Windows with Oracle/OpenJDK you can call System.gc() to
(hopefully) release the ByteBuffers that lock the memory regions (and
thereby make the files deletable) - but this adds significant
overhead. The dispose methods on ByteBufferImpls are not easily
accessible - you would need some introspection hackery to get hold of
that cleaner(), and that would of course only work on Oracle/OpenJDK.

Close your eyes -  GPL3!
https://github.com/stain/jdk8u/blob/master/src/share/classes/java/nio/Direct-X-Buffer.java.template#L72
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/nio/DirectByteBuffer.java/#72


I tried using the FileChannels from JDK7 NIO2 (e.g.
FileChannel.open(Path)) instead of going through RandomAccessFile - but it
did not make any difference, as fc.map() still does the same thing.
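
A minimal sketch of the best-effort behaviour described above, using plain
JDK NIO only. The class and method names (MappedDeleteSketch, touchMapped,
deleteBestEffort) are hypothetical and are not part of TDB or its
ChannelManager. It maps a file via FileChannel.open(Path) and fc.map(), then
retries the delete with a System.gc() hint between attempts. System.gc() is
only a hint, so on Windows the delete may still fail.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedDeleteSketch {

    /** Maps the first 16 bytes of the file, writes one byte and reads it back. */
    static byte touchMapped(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_WRITE, 0, 16);
            buf.put(0, (byte) 42);
            return buf.get(0);
        }
        // Closing the channel does not unmap the region; the mapping stays
        // valid until the buffer object itself is garbage-collected.
    }

    /** Tries to delete the file, hinting a GC between failed attempts. */
    static boolean deleteBestEffort(Path file, int attempts) {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.deleteIfExists(file);
                return true; // the file is gone (or never existed)
            } catch (IOException e) {
                System.gc(); // a hint only; the JVM is free to ignore it
                try {
                    Thread.sleep(50);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-sketch", ".dat");
        System.out.println("byte read back: " + touchMapped(file));
        System.out.println("deleted: " + deleteBestEffort(file, 20));
    }
}
```

On platforms other than Windows the first delete attempt normally succeeds,
so the retry loop only matters where a live mapping blocks deletion.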


Perhaps System.gc() is not worth it in general (* on Windows) when
closing a dataset - I tried modifying the ChannelManager to always do
this on release; each test in jena-jdbc-tdb then took 1.5s instead
of 0.2s, but it did allow me to delete the used folders from target/
while the JVM/test was running.

For the tests we could do something like calling System.gc() after every
10 tests and then wiping the old data.
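
The every-10-tests idea could be sketched roughly as below. This is an
illustration under assumptions, not code from jena-jdbc-tdb: the class
PeriodicTestCleaner and its methods are hypothetical, and the interval is a
constructor argument rather than a fixed 10. Directories that cannot be
deleted yet (for example because a mapping is still live on Windows) stay
queued for the next round.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class PeriodicTestCleaner {
    private final int interval;
    private final Deque<Path> pending = new ArrayDeque<>();
    private int finished = 0;

    PeriodicTestCleaner(int interval) {
        this.interval = interval;
    }

    /**
     * Call once per finished test with the directory that test used.
     * Returns how many queued directories were deleted by this call.
     */
    int testFinished(Path usedDir) {
        pending.add(usedDir);
        finished++;
        if (finished % interval != 0) {
            return 0;
        }
        System.gc(); // best effort: may release mapped buffers, may not
        int deleted = 0;
        int queued = pending.size();
        for (int i = 0; i < queued; i++) {
            Path dir = pending.poll();
            if (deleteTree(dir)) {
                deleted++;
            } else {
                pending.add(dir); // retry on the next round
            }
        }
        return deleted;
    }

    int pendingCount() {
        return pending.size();
    }

    /** Deletes a directory tree, children before parents. */
    static boolean deleteTree(Path root) {
        if (!Files.exists(root)) {
            return true;
        }
        try {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(root)) {
                paths = walk.sorted(Comparator.reverseOrder())
                            .collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
            return true;
        } catch (IOException | UncheckedIOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        PeriodicTestCleaner cleaner = new PeriodicTestCleaner(10);
        for (int i = 0; i < 10; i++) {
            Path dir = Files.createTempDirectory("tdb-test");
            Files.write(dir.resolve("data.dat"), new byte[] {1, 2, 3});
            System.out.println("deleted this round: " + cleaner.testFinished(dir));
        }
        System.out.println("still pending: " + cleaner.pendingCount());
    }
}
```

This spreads the System.gc() cost over several tests instead of paying it on
every release, at the price of keeping up to one interval's worth of old
data on disk.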

Perhaps Fuseki 2 could do System.gc() on [Remove] if SystemTDB.isWindows.



On 10 March 2015 at 10:00, ASF GitHub Bot (JIRA) j...@apache.org wrote:

 [ https://issues.apache.org/jira/browse/JENA-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354617#comment-14354617 ]

 ASF GitHub Bot commented on JENA-897:
 -

 Github user asfgit closed the pull request at:

 https://github.com/apache/jena/pull/41


 jena-jdbc-tdb tests use %TEMP% instead of target/
 -

 Key: JENA-897
 URL: https://issues.apache.org/jira/browse/JENA-897
 Project: Apache Jena
  Issue Type: Bug
  Components: JDBC
Affects Versions: Jena 2.12.1, Jena 2.13.0
 Environment: Windows 8.0 x64, C: with 34 GB free
Reporter: Stian Soiland-Reyes
Priority: Critical
 Fix For: Jena 2.13.1


 .. and thus mvn clean install on Windows will easily consume 37 GB on C: and 
 run out of disk space - even if Jena is built on a larger partition.



 --
 This message was sent by Atlassian JIRA
 (v6.3.4#6332)



-- 
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/-0001-9842-9718


Re: Disk space requirement for building on Windows

2015-03-10 Thread Stian Soiland-Reyes
System.gc() is not guaranteed to work - but it did work for me.
That is what I meant by a 'best effort' within the test. The test does
not require all those old test folders to hang around.

Could we work around this in the test by setting a smaller block size,
say, instead of 8 MB?


As I have been unable to build all of Jena on Windows - what is the
actual disk space requirement? It should at least be documented in the
README.

On 10 March 2015 at 11:07, Rob Vesse rve...@dotnetrdf.org wrote:
 Using an alternative approach would not make any difference

 It is a fundamental bug in Windows memory mapped files that means that a
 JVM can never guarantee to completely release memory mapped files while
 the JVM is alive.

 Andy has posted this many times on threads about TDB on Windows in the
 past.  No workaround we could attempt could ever solve the issue on
 Windows so there is really no point in expending effort changing something
 low level that otherwise works fine across multiple platforms.

 Rob

-- 
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons RDF (incubating)
http://orcid.org/-0001-9842-9718


Re: Disk space requirement for building on Windows

2015-03-10 Thread Rob Vesse
Using an alternative approach would not make any difference

It is a fundamental bug in Windows memory mapped files that means that a
JVM can never guarantee to completely release memory mapped files while
the JVM is alive.

Andy has posted this many times on threads about TDB on Windows in the
past.  No workaround we could attempt could ever solve the issue on
Windows so there is really no point in expending effort changing something
low level that otherwise works fine across multiple platforms.

Rob



Re: Chairs: A small addition to the Marvin email you received yesterday.

2015-03-10 Thread jan i
Hi.

The tool is really awesome, but is there a reason to keep it so secret?

I would like to share the nice mail graphs with my fellow committers, and I
cannot really see any secret in the reports.

My suggestion is, keep it as it is for people who do a login (that is a big
help), but allow a non-login version with (nearly) the same information.

I find tools like this awesome, but do not understand why we try to make a
lot of this for members or PMCs only. Apache is also about transparency,
and this tool only collects information (except for the private lists) that
can be found publicly.


rgds
jan I.


On 10 March 2015 at 17:42, Kevin A. McGrail kevin.mcgr...@mcgrail.com
wrote:

 This is AWESOME!


 On 3/5/2015 9:31 AM, Daniel Gruno wrote:

 Hi Project chairs,
 In yesterday's email to you about your upcoming board report, we forgot
 to mention that we have a new tool that can help you in cobbling together a
 report, or just view statistics of the PMCs you are on.

 The new service is located at: https://reporter.apache.org and is PMC
 members only.
 Should you choose to make use of the board report template in this
 system, do remember to add in the important activity bits and any issues
 that require board activity.

 Next time Marvin sends you an email, it will include the URL for the
 reporter system.

 If you have ANY feedback about this system, don't hesitate to let us
 know! :)

 On behalf of the Community Development Project,
 Daniel.





Re: Chairs: A small addition to the Marvin email you received yesterday.

2015-03-10 Thread Daniel Gruno

Hi Jan,
It's not a secret - it was disclosed on a public mailing list (this one) :).
Currently a login is required to be able to compile the list of projects
you are affiliated with, and it is kept PMCs only because it only makes
sense for PMC members. Secondly, it is not geared for anonymous
viewing simply because the data compilation does not scale well. It can 
handle some 5,000 people knowing about it, but not 5,000,000 people :).
Thirdly, this is a tool for generating a board report, not an activity 
monitor meant for the public. I wrote it to serve the PMCs, and the PMCs 
may or may not choose to have mailing list data publicly available, that 
is not for me to decide.


What we _could_ do publicly is tie some of the gathered information into 
the new projects.apache.org site, and provide it through there 
(committership/PMC changes for instance as well as release data), but I 
am at the edge of what I'm willing to single-handedly do for that 
project (it was never meant to be a one-man project), so I'll need 
someone else to step in and collaborate with me on that.


With regards,
Daniel.



Re: Chairs: A small addition to the Marvin email you received yesterday.

2015-03-10 Thread jan i
On 10 March 2015 at 18:14, Daniel Gruno humbed...@apache.org wrote:

 Hi Jan,
 It's not a secret - it was disclosed on a public mailing list (this one)
 :).
 Currently a login is required to be able to compile the list of projects
 you are affiliated with, and it is kept PMCs only because it only makes
 sense for PMC members. Secondly, it is not geared for anonymous viewing
 simply because the data compilation does not scale well. It can handle some
 5,000 people knowing about it, but not 5,000,000 people :).
 Thirdly, this is a tool for generating a board report, not an activity
 monitor meant for the public. I wrote it to serve the PMCs, and the PMCs
 may or may not choose to have mailing list data publicly available, that is
 not for me to decide.

OK got it, a very valid reason.



 What we _could_ do publicly is tie some of the gathered information into
 the new projects.apache.org site, and provide it through there
 (committership/PMC changes for instance as well as release data), but I am
 at the edge of what I'm willing to single-handedly do for that project (it
 was never meant to be a one-man project), so I'll need someone else to step
 in and collaborate with me on that.

Sounds like a hint to me. I would like to see the mail graphs public, as
they are a good barometer for the communities.

If my skills are suited, I volunteer to help you, even though it will be in
the form of patches, since I believe I do not have karma.

Can you give me some pointers offlist? (I will come on hipchat later.)

rgds
jan I.



 With regards,
 Daniel.




Re: Chairs: A small addition to the Marvin email you received yesterday.

2015-03-10 Thread Kevin A. McGrail

This is AWESOME!

On 3/5/2015 9:31 AM, Daniel Gruno wrote:

Hi Project chairs,
In yesterday's email to you about your upcoming board report, we 
forgot to mention that we have a new tool that can help you in 
cobbling together a report, or just view statistics of the PMCs you 
are on.


The new service is located at: https://reporter.apache.org and is PMC 
members only.
Should you choose to make use of the board report template in this 
system, do remember to add in the important activity bits and any 
issues that require board activity.


Next time Marvin sends you an email, it will include the URL for the 
reporter system.


If you have ANY feedback about this system, don't hesitate to let us 
know! :)


On behalf of the Community Development Project,
Daniel.




Re: [VOTE] Replace projects.apache.org with projects-new.apache.org

2015-03-10 Thread Rich Bowen



On 03/06/2015 11:52 AM, Rich Bowen wrote:

I'd like for us to go ahead and replace projects.apache.org with
projects-new.apache.org. It now has all the functionality that
projects.a.o has, and much more, and there's no reason to have two sites
up. If you object to moving forward with this, please say so.

[ ] +1, do it
[ ] +0, whatevs
[ ] -1, No (and say why, so we can address the problem)



I'm going to call this a yes vote overall, with a few nits that have 
been addressed. Thank you all for your thoughts. Thanks, Daniel, for 
your work on this. And with all the folks that have said they'll get 
checkouts and hack on it, we should have much wonderment real soon.


--Rich


--
Rich Bowen - rbo...@rcbowen.com - @rbowen
http://apachecon.com/ - @apachecon


Apply for mentor role in ASF for GSC

2015-03-10 Thread Dingcheng Li
Dear ASF administrator, I am a research scientist and a senior software
developer at Mayo Clinic medical informatics group. I am also a PMC member
for cTAKES which is an open-source software developed by our group and
released via ASF. I am quite interested in GSC and would like to serve as a
mentor. I hope that more good software can be developed by young students
and contributed to the ASF repository. I hope that I can be accepted by you as
a mentor. Thanks, Dingcheng