Re: Apply for mentor role in ASF for GSC
Please refer to my emails detailing the process, which I sent to all PMCs. Thanks, Uli On 2015-03-09 19:35, Dingcheng Li wrote: Dear ASF administrator, I am a research scientist and a senior software developer at the Mayo Clinic medical informatics group. I am also a PMC member for cTAKES, which is open-source software developed by our group and released via ASF. I am quite interested in GSC and would like to serve as a mentor. I hope that more good software can be developed by young students and contributed to the repository of ASF. I hope that I can be accepted by you as a mentor. Thanks, Dingcheng
Disk space requirement for building on Windows (was: Re: [jira] [Commented] (JENA-897) jena-jdbc-tdb tests use %TEMP% instead of target/)
Thanks, Rob! I tried looking yesterday at ways to reduce the disk space requirements when building on Windows - including truncating the files after closing. This seems to require deep changes to TDB's ChannelManager, which keeps the corresponding FileChannels - perhaps a new method for that purpose? https://github.com/apache/jena/blob/master/jena-tdb/src/main/java/com/hp/hpl/jena/tdb/base/file/ChannelManager.java It seems that on Windows with Oracle/OpenJDK you can call System.gc() to (hopefully) release the ByteBuffers that lock the memory regions (and thereby make the files deletable) - but this adds a significant overhead. The dispose methods on ByteBufferImpls are not easily accessible - you would need some introspection hackery to get hold of that cleaner(), and that would of course only work on Oracle/OpenJDK. Close your eyes - GPL3! https://github.com/stain/jdk8u/blob/master/src/share/classes/java/nio/Direct-X-Buffer.java.template#L72 http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/nio/DirectByteBuffer.java/#72 I tried using the FileChannels from JDK7 NIO2 (e.g. FileChannel.open(Path)) instead of going through RandomAccessFile - but it did not make any difference, as fc.map() still does the same thing. Perhaps System.gc() is not worth it in general (* on Windows) when closing a dataset - I tried to modify the ChannelManager to always do this on release; it meant each test in jena-jdbc-tdb took 1.5s instead of 0.2s, but it did allow me to delete the used folders from target/ while the JVM/test was running. For the tests we could do something like: for every 10 tests, do System.gc() and wipe the old data. Perhaps Fuseki 2 could do System.gc() on [Remove] SystemTDB.isWindows.
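A minimal sketch of the best-effort cleanup discussed above, not Jena's actual code. It maps a file through FileChannel.map(), lets the buffer go out of scope, and retries the delete with System.gc() between attempts. The class name MappedFileCleanup, both method names, the retry count and the 50 ms pause are assumptions made for illustration. System.gc() is only a hint to the JVM, so on Windows the delete may still fail after all attempts.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Jena or TDB.
public class MappedFileCleanup {

    /** Maps {@code size} bytes of the file, writes one byte, then drops the buffer reference. */
    static void writeMapped(Path file, int size) throws IOException {
        try (FileChannel fc = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping in READ_WRITE mode grows the file to `size` bytes if it is shorter.
            MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.put(0, (byte) 42);
            buf.force();
        }
        // Closing the channel does not unmap the region: the mapping stays valid
        // until the buffer object itself is garbage collected.
    }

    /**
     * Tries to delete the file, asking for a garbage collection between attempts.
     * Returns true if the file is gone, false if every attempt failed.
     */
    static boolean deleteBestEffort(Path file, int attempts) throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            try {
                Files.deleteIfExists(file);
                return true;
            } catch (IOException e) {
                // Typically Windows only: the file is still locked by a live mapping.
                System.gc();        // a hint, not a guarantee
                Thread.sleep(50);   // assumed pause to give the collector a chance to run
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("mapped-demo", ".dat");
        writeMapped(file, 8 * 1024 * 1024); // 8 MB, the block size mentioned in this thread
        System.out.println("deleted=" + deleteBestEffort(file, 5));
    }
}
```

Only calling System.gc() after a delete has actually failed keeps the overhead off platforms where the first attempt succeeds; a test suite could apply the same idea once per batch of tests rather than on every dataset close.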
On 10 March 2015 at 10:00, ASF GitHub Bot (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/JENA-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354617#comment-14354617 ] ASF GitHub Bot commented on JENA-897: - Github user asfgit closed the pull request at: https://github.com/apache/jena/pull/41 jena-jdbc-tdb tests use %TEMP% instead of target/ - Key: JENA-897 URL: https://issues.apache.org/jira/browse/JENA-897 Project: Apache Jena Issue Type: Bug Components: JDBC Affects Versions: Jena 2.12.1, Jena 2.13.0 Environment: Windows 8.0 x64, C: with 34 GB free Reporter: Stian Soiland-Reyes Priority: Critical Fix For: Jena 2.13.1 .. and thus mvn clean install on Windows will easily consume 37 GB on C: and run out of disk space - even if Jena is built on a larger partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332) -- Stian Soiland-Reyes Apache Taverna (incubating), Apache Commons RDF (incubating) http://orcid.org/-0001-9842-9718
Re: Disk space requirement for building on Windows
System.gc() is not guaranteed to work - but it did work for me. That is what I meant by a 'best effort' within the test. The test does not require all those old test folders to hang around. Could we do a workaround for the test to set, say, a smaller block size instead of 8 MB? As I have been unable to build all of Jena on Windows - what is the actual disk space requirement? It should at least be documented in the README. On 10 March 2015 at 11:07, Rob Vesse rve...@dotnetrdf.org wrote: Using an alternative approach would not make any difference. It is a fundamental bug in Windows memory mapped files that means that a JVM can never guarantee to completely release memory mapped files while the JVM is alive. Andy has posted this many times on threads about TDB on Windows in the past. No workaround we could attempt could ever solve the issue on Windows, so there is really no point in expending effort changing something low level that otherwise works fine across multiple platforms. Rob On 10/03/2015 10:25, Stian Soiland-Reyes st...@apache.org wrote: Thanks, Rob! I tried looking yesterday at ways to reduce the disk space requirements when building on Windows - including truncating the files after closing. This seems to require deep changes to TDB's ChannelManager, which keeps the corresponding FileChannels - perhaps a new method for that purpose? https://github.com/apache/jena/blob/master/jena-tdb/src/main/java/com/hp/hpl/jena/tdb/base/file/ChannelManager.java It seems that on Windows with Oracle/OpenJDK you can call System.gc() to (hopefully) release the ByteBuffers that lock the memory regions (and thereby make the files deletable) - but this adds a significant overhead. The dispose methods on ByteBufferImpls are not easily accessible - you would need some introspection hackery to get hold of that cleaner(), and that would of course only work on Oracle/OpenJDK. Close your eyes - GPL3! https://github.com/stain/jdk8u/blob/master/src/share/classes/java/nio/Direct-X-Buffer.java.template#L72 http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/nio/DirectByteBuffer.java/#72 I tried using the FileChannels from JDK7 NIO2 (e.g. FileChannel.open(Path)) instead of going through RandomAccessFile - but it did not make any difference, as fc.map() still does the same thing. Perhaps System.gc() is not worth it in general (* on Windows) when closing a dataset - I tried to modify the ChannelManager to always do this on release; it meant each test in jena-jdbc-tdb took 1.5s instead of 0.2s, but it did allow me to delete the used folders from target/ while the JVM/test was running. For the tests we could do something like: for every 10 tests, do System.gc() and wipe the old data. Perhaps Fuseki 2 could do System.gc() on [Remove] SystemTDB.isWindows. On 10 March 2015 at 10:00, ASF GitHub Bot (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/JENA-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354617#comment-14354617 ] ASF GitHub Bot commented on JENA-897: - Github user asfgit closed the pull request at: https://github.com/apache/jena/pull/41 jena-jdbc-tdb tests use %TEMP% instead of target/ - Key: JENA-897 URL: https://issues.apache.org/jira/browse/JENA-897 Project: Apache Jena Issue Type: Bug Components: JDBC Affects Versions: Jena 2.12.1, Jena 2.13.0 Environment: Windows 8.0 x64, C: with 34 GB free Reporter: Stian Soiland-Reyes Priority: Critical Fix For: Jena 2.13.1 .. and thus mvn clean install on Windows will easily consume 37 GB on C: and run out of disk space - even if Jena is built on a larger partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332) -- Stian Soiland-Reyes Apache Taverna (incubating), Apache Commons RDF (incubating) http://orcid.org/-0001-9842-9718
Re: Disk space requirement for building on Windows
Using an alternative approach would not make any difference. It is a fundamental bug in Windows memory mapped files that means that a JVM can never guarantee to completely release memory mapped files while the JVM is alive. Andy has posted this many times on threads about TDB on Windows in the past. No workaround we could attempt could ever solve the issue on Windows, so there is really no point in expending effort changing something low level that otherwise works fine across multiple platforms. Rob On 10/03/2015 10:25, Stian Soiland-Reyes st...@apache.org wrote: Thanks, Rob! I tried looking yesterday at ways to reduce the disk space requirements when building on Windows - including truncating the files after closing. This seems to require deep changes to TDB's ChannelManager, which keeps the corresponding FileChannels - perhaps a new method for that purpose? https://github.com/apache/jena/blob/master/jena-tdb/src/main/java/com/hp/hpl/jena/tdb/base/file/ChannelManager.java It seems that on Windows with Oracle/OpenJDK you can call System.gc() to (hopefully) release the ByteBuffers that lock the memory regions (and thereby make the files deletable) - but this adds a significant overhead. The dispose methods on ByteBufferImpls are not easily accessible - you would need some introspection hackery to get hold of that cleaner(), and that would of course only work on Oracle/OpenJDK. Close your eyes - GPL3! https://github.com/stain/jdk8u/blob/master/src/share/classes/java/nio/Direct-X-Buffer.java.template#L72 http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/nio/DirectByteBuffer.java/#72 I tried using the FileChannels from JDK7 NIO2 (e.g. FileChannel.open(Path)) instead of going through RandomAccessFile - but it did not make any difference, as fc.map() still does the same thing. Perhaps System.gc() is not worth it in general (* on Windows) when closing a dataset - I tried to modify the ChannelManager to always do this on release; it meant each test in jena-jdbc-tdb took 1.5s instead of 0.2s, but it did allow me to delete the used folders from target/ while the JVM/test was running. For the tests we could do something like: for every 10 tests, do System.gc() and wipe the old data. Perhaps Fuseki 2 could do System.gc() on [Remove] SystemTDB.isWindows. On 10 March 2015 at 10:00, ASF GitHub Bot (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/JENA-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354617#comment-14354617 ] ASF GitHub Bot commented on JENA-897: - Github user asfgit closed the pull request at: https://github.com/apache/jena/pull/41 jena-jdbc-tdb tests use %TEMP% instead of target/ - Key: JENA-897 URL: https://issues.apache.org/jira/browse/JENA-897 Project: Apache Jena Issue Type: Bug Components: JDBC Affects Versions: Jena 2.12.1, Jena 2.13.0 Environment: Windows 8.0 x64, C: with 34 GB free Reporter: Stian Soiland-Reyes Priority: Critical Fix For: Jena 2.13.1 .. and thus mvn clean install on Windows will easily consume 37 GB on C: and run out of disk space - even if Jena is built on a larger partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332) -- Stian Soiland-Reyes Apache Taverna (incubating), Apache Commons RDF (incubating) http://orcid.org/-0001-9842-9718
Re: Chairs: A small addition to the Marvin email you received yesterday.
Hi. The tool is really awesome, but is there a reason to keep it so secret? I would like to share the nice mail graphs with my fellow committers, and I cannot really see any secret in the reports. My suggestion is: keep it as it is for people who log in (that is a big help), but allow a non-login version with (nearly) the same information. I find tools like this awesome, but do not understand why we try to make a lot of this for members or PMCs only. Apache is also about transparency, and this tool only collects information (except for the private lists) that can be found publicly. rgds jan I. On 10 March 2015 at 17:42, Kevin A. McGrail kevin.mcgr...@mcgrail.com wrote: This is AWESOME! On 3/5/2015 9:31 AM, Daniel Gruno wrote: Hi Project chairs, In yesterday's email to you about your upcoming board report, we forgot to mention that we have a new tool that can help you in cobbling together a report, or just view statistics of the PMCs you are on. The new service is located at: https://reporter.apache.org and is PMC members only. Should you choose to make use of the board report template in this system, do remember to add in the important activity bits and any issues that require board activity. Next time Marvin sends you an email, it will include the URL for the reporter system. If you have ANY feedback about this system, don't hesitate to let us know! :) On behalf of the Community Development Project, Daniel.
Re: Chairs: A small addition to the Marvin email you received yesterday.
Hi Jan, It's not a secret - it was disclosed on a public mailing list (this one) :). Currently a login is required to be able to compile the list of projects you are affiliated with, and it is kept PMCs only because it only makes sense for PMC members. Secondly, it is not geared for anonymous viewing simply because the data compilation does not scale well. It can handle some 5,000 people knowing about it, but not 5,000,000 people :). Thirdly, this is a tool for generating a board report, not an activity monitor meant for the public. I wrote it to serve the PMCs, and the PMCs may or may not choose to have mailing list data publicly available; that is not for me to decide. What we _could_ do publicly is tie some of the gathered information into the new projects.apache.org site, and provide it through there (committership/PMC changes for instance, as well as release data), but I am at the edge of what I'm willing to single-handedly do for that project (it was never meant to be a one-man project), so I'll need someone else to step in and collaborate with me on that. With regards, Daniel. On 2015-03-10 18:04, jan i wrote: Hi. The tool is really awesome, but is there a reason to keep it so secret? I would like to share the nice mail graphs with my fellow committers, and I cannot really see any secret in the reports. My suggestion is: keep it as it is for people who log in (that is a big help), but allow a non-login version with (nearly) the same information. I find tools like this awesome, but do not understand why we try to make a lot of this for members or PMCs only. Apache is also about transparency, and this tool only collects information (except for the private lists) that can be found publicly. rgds jan I. On 10 March 2015 at 17:42, Kevin A. McGrail kevin.mcgr...@mcgrail.com wrote: This is AWESOME! 
On 3/5/2015 9:31 AM, Daniel Gruno wrote: Hi Project chairs, In yesterday's email to you about your upcoming board report, we forgot to mention that we have a new tool that can help you in cobbling together a report, or just view statistics of the PMCs you are on. The new service is located at: https://reporter.apache.org and is PMC members only. Should you choose to make use of the board report template in this system, do remember to add in the important activity bits and any issues that require board activity. Next time Marvin sends you an email, it will include the URL for the reporter system. If you have ANY feedback about this system, don't hesitate to let us know! :) On behalf of the Community Development Project, Daniel.
Re: Chairs: A small addition to the Marvin email you received yesterday.
On 10 March 2015 at 18:14, Daniel Gruno humbed...@apache.org wrote: Hi Jan, It's not a secret - it was disclosed on a public mailing list (this one) :). Currently a login is required to be able to compile the list of projects you are affiliated with, and it is kept PMCs only because it only makes sense for PMC members. Secondly, it is not geared for anonymous viewing simply because the data compilation does not scale well. It can handle some 5,000 people knowing about it, but not 5,000,000 people :). Thirdly, this is a tool for generating a board report, not an activity monitor meant for the public. I wrote it to serve the PMCs, and the PMCs may or may not choose to have mailing list data publicly available; that is not for me to decide. OK got it, a very valid reason. What we _could_ do publicly is tie some of the gathered information into the new projects.apache.org site, and provide it through there (committership/PMC changes for instance, as well as release data), but I am at the edge of what I'm willing to single-handedly do for that project (it was never meant to be a one-man project), so I'll need someone else to step in and collaborate with me on that. Sounds like a hint to me. I would like to see the mail graphs public, as they are a good barometer for the communities. If my skills are suited, I volunteer to help you, even though it will be in the form of patches, since I believe I do not have karma. Can you give me some pointers offlist (I will come on hipchat later)? rgds jan I. With regards, Daniel. On 2015-03-10 18:04, jan i wrote: Hi. The tool is really awesome, but is there a reason to keep it so secret? I would like to share the nice mail graphs with my fellow committers, and I cannot really see any secret in the reports. My suggestion is: keep it as it is for people who log in (that is a big help), but allow a non-login version with (nearly) the same information. 
I find tools like this awesome, but do not understand why we try to make a lot of this for members or PMCs only. Apache is also about transparency, and this tool only collects information (except for the private lists) that can be found publicly. rgds jan I. On 10 March 2015 at 17:42, Kevin A. McGrail kevin.mcgr...@mcgrail.com wrote: This is AWESOME! On 3/5/2015 9:31 AM, Daniel Gruno wrote: Hi Project chairs, In yesterday's email to you about your upcoming board report, we forgot to mention that we have a new tool that can help you in cobbling together a report, or just view statistics of the PMCs you are on. The new service is located at: https://reporter.apache.org and is PMC members only. Should you choose to make use of the board report template in this system, do remember to add in the important activity bits and any issues that require board activity. Next time Marvin sends you an email, it will include the URL for the reporter system. If you have ANY feedback about this system, don't hesitate to let us know! :) On behalf of the Community Development Project, Daniel.
Re: Chairs: A small addition to the Marvin email you received yesterday.
This is AWESOME! On 3/5/2015 9:31 AM, Daniel Gruno wrote: Hi Project chairs, In yesterday's email to you about your upcoming board report, we forgot to mention that we have a new tool that can help you in cobbling together a report, or just view statistics of the PMCs you are on. The new service is located at: https://reporter.apache.org and is PMC members only. Should you choose to make use of the board report template in this system, do remember to add in the important activity bits and any issues that require board activity. Next time Marvin sends you an email, it will include the URL for the reporter system. If you have ANY feedback about this system, don't hesitate to let us know! :) On behalf of the Community Development Project, Daniel.
Re: [VOTE] Replace projects.apache.org with projects-new.apache.org
On 03/06/2015 11:52 AM, Rich Bowen wrote: I'd like for us to go ahead and replace projects.apache.org with projects-new.apache.org. It now has all the functionality that projects.a.o has, and much more, and there's no reason to have two sites up. If you object to moving forward with this, please say so. [ ] +1, do it [ ] +0, whatevs [ ] -1, No (and say why, so we can address the problem) I'm going to call this a yes vote overall, with a few nits that have been addressed. Thank you all for your thoughts. Thanks, Daniel, for your work on this. And with all the folks that have said they'll get checkouts and hack on it, we should have much wonderment real soon. --Rich -- Rich Bowen - rbo...@rcbowen.com - @rbowen http://apachecon.com/ - @apachecon
Apply for mentor role in ASF for GSC
Dear ASF administrator, I am a research scientist and a senior software developer at the Mayo Clinic medical informatics group. I am also a PMC member for cTAKES, which is open-source software developed by our group and released via ASF. I am quite interested in GSC and would like to serve as a mentor. I hope that more good software can be developed by young students and contributed to the repository of ASF. I hope that I can be accepted by you as a mentor. Thanks, Dingcheng