Re: Possible to exclude directories from analysis?
On Wed, 2018-09-12 at 14:37 +0200, Daniel Gruno wrote:
> top posting, yaay!
>
> I have a new server sort of set up now. I'll have infra redirect to
> that one instead, and it'll rebuild most of the database during the
> night (and the next night, and...).
>
> There's a new option, a path filter in the repos tab, which filters
> commits, line changes, trends, top contributors etc. by paths
> affected, so you can enter either 'jbake' to get everything touching
> jbake, or '!jbake' to get everything that doesn't touch those files.
>
> The CNAME should switch over some time today :)

Cool, thanks! I'll give this a shot later.

Robert

> With regards,
> Daniel.
>
> PS: We're also switching to Elasticsearch 6 with this move, which is
> going to be great, as that allows us to test on a modern ES, instead
> of the old 5.x installation we're currently running on.
>
> On 09/12/2018 12:24 PM, Daniel Gruno wrote:
> > On 09/12/2018 12:22 PM, Robert Munteanu wrote:
> > > If you look at the sling-site repository at [1] we have the
> > > actual documentation under src/main/jbake, with
> > > - content being markdown files
> > > - templates being ... well ... templates
> > > - and assets being static files
> > >
> > > Some of those static files are generated (javadoc, Maven plugin
> > > sites) and should not be recorded by Kibble. Those are the ones
> > > that are problematic, especially since we have a large number of
> > > javadocs committed.
> > >
> > > $ find src/main/jbake/assets/apidocs -type f | wc -l
> > > 7127
> > >
> > > Those are the ones we'd like excluded, if at all possible.
> > >
> > > Thanks,
> > >
> > > Robert
> > >
> > > [1]: https://github.com/apache/sling-site/tree/master/src/main/jbake
> >
> > I've added a change to the scanners, so they will put a list of
> > file changes into each commit object we record. This is likely
> > going to require a complete re-scan of all things sling...which in
> > theory is fine, as I _was_ planning on moving the demo server to a
> > new box anyway.
> > I'll let y'all know more when I have that worked out in my mind :)
> > After the move and re-scan, it should be possible to exclude by
> > file paths.
Re: Possible to exclude directories from analysis?
top posting, yaay!

I have a new server sort of set up now. I'll have infra redirect to that
one instead, and it'll rebuild most of the database during the night
(and the next night, and...).

There's a new option, a path filter in the repos tab, which filters
commits, line changes, trends, top contributors etc. by paths affected,
so you can enter either 'jbake' to get everything touching jbake, or
'!jbake' to get everything that doesn't touch those files.

The CNAME should switch over some time today :)

With regards,
Daniel.

PS: We're also switching to Elasticsearch 6 with this move, which is
going to be great, as that allows us to test on a modern ES, instead of
the old 5.x installation we're currently running on.

On 09/12/2018 12:24 PM, Daniel Gruno wrote:
> On 09/12/2018 12:22 PM, Robert Munteanu wrote:
> > If you look at the sling-site repository at [1] we have the actual
> > documentation under src/main/jbake, with
> > - content being markdown files
> > - templates being ... well ... templates
> > - and assets being static files
> >
> > Some of those static files are generated (javadoc, Maven plugin
> > sites) and should not be recorded by Kibble. Those are the ones
> > that are problematic, especially since we have a large number of
> > javadocs committed.
> >
> > $ find src/main/jbake/assets/apidocs -type f | wc -l
> > 7127
> >
> > Those are the ones we'd like excluded, if at all possible.
> >
> > Thanks,
> > Robert
> >
> > [1]: https://github.com/apache/sling-site/tree/master/src/main/jbake
>
> I've added a change to the scanners, so they will put a list of file
> changes into each commit object we record. This is likely going to
> require a complete re-scan of all things sling...which in theory is
> fine, as I _was_ planning on moving the demo server to a new box
> anyway.
> I'll let y'all know more when I have that worked out in my mind :)
> After the move and re-scan, it should be possible to exclude by file
> paths.
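As a rough illustration of the path filter described above, here is a minimal Python sketch. It is not Kibble's actual code: the function name `commit_matches`, the sample commits, and the choice of plain substring matching are all assumptions made for the example. The only behaviour taken from the message is that 'jbake' selects what touches jbake and '!jbake' selects what does not.

```python
def commit_matches(files, path_filter):
    """Return True if a commit's changed files satisfy the path filter.

    Assumption: a plain substring match on each path, with a leading '!'
    meaning "no changed file may match".
    """
    negate = path_filter.startswith("!")
    needle = path_filter[1:] if negate else path_filter
    touches = any(needle in path for path in files)
    return not touches if negate else touches


# Hypothetical commits, each carrying the list of files it changed.
commits = [
    {"id": "a1", "files": ["src/main/jbake/content/index.md"]},
    {"id": "b2", "files": ["pom.xml", "README.md"]},
    {"id": "c3", "files": ["src/main/jbake/assets/apidocs/index.html", "pom.xml"]},
]

touching = [c["id"] for c in commits if commit_matches(c["files"], "jbake")]
not_touching = [c["id"] for c in commits if commit_matches(c["files"], "!jbake")]
print(touching, not_touching)
```

Under these assumptions a commit that touches both jbake and non-jbake files counts as touching jbake, so '!jbake' drops it entirely; a real implementation might instead subtract only the matching files' line changes.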
Re: Possible to exclude directories from analysis?
On 09/12/2018 12:22 PM, Robert Munteanu wrote:
> If you look at the sling-site repository at [1] we have the actual
> documentation under src/main/jbake, with
> - content being markdown files
> - templates being ... well ... templates
> - and assets being static files
>
> Some of those static files are generated (javadoc, Maven plugin
> sites) and should not be recorded by Kibble. Those are the ones that
> are problematic, especially since we have a large number of javadocs
> committed.
>
> $ find src/main/jbake/assets/apidocs -type f | wc -l
> 7127
>
> Those are the ones we'd like excluded, if at all possible.
>
> Thanks,
> Robert
>
> [1]: https://github.com/apache/sling-site/tree/master/src/main/jbake

I've added a change to the scanners, so they will put a list of file
changes into each commit object we record. This is likely going to
require a complete re-scan of all things sling...which in theory is
fine, as I _was_ planning on moving the demo server to a new box anyway.
I'll let y'all know more when I have that worked out in my mind :)
After the move and re-scan, it should be possible to exclude by file
paths.
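To make the scanner change described above more concrete, here is a minimal Python sketch of attaching a list of changed files to each commit record. It is not the Kibble scanner itself. It assumes the log text comes from something like `git log --name-only --pretty=format:"commit %H"`, and the function name `parse_name_only_log`, the record layout, and the sample hashes are hypothetical.

```python
def parse_name_only_log(text):
    """Turn 'commit <hash>' headers followed by file names into records.

    Assumption: every commit starts with a line 'commit <hash>' and each
    following non-blank line is a path changed by that commit.
    """
    commits = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator between commits
        if line.startswith("commit "):
            current = {"hash": line.split(" ", 1)[1], "files": []}
            commits.append(current)
        elif current is not None:
            current["files"].append(line)
    return commits


# Hypothetical log output for two commits.
sample = """commit 1111111
src/main/jbake/content/index.md
src/main/jbake/assets/apidocs/index.html

commit 2222222
pom.xml
"""

records = parse_name_only_log(sample)
for record in records:
    print(record["hash"], len(record["files"]))
```

With the file list stored on each record, excluding by path becomes a check on `record["files"]` rather than a fresh walk through the repository, which is presumably why a full re-scan is needed first.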
Re: Possible to exclude directories from analysis?
On Wed, 2018-09-12 at 12:05 +0200, Daniel Gruno wrote:
> On 09/12/2018 12:00 PM, Robert Munteanu wrote:
> > On Sat, 2018-09-08 at 12:54 +0200, Daniel Gruno wrote:
> > > On 09/05/2018 08:38 PM, Robert Munteanu wrote:
> > > > Hi,
> > > >
> > > > I'm using the demo Kibble instance to visualise code
> > > > contributions for the Apache Sling project. One thing I
> > > > noticed is that Kibble thinks we're 75% HTML, which is not
> > > > right - we're a Java project.
> > > >
> > > > I think it's due to the fact that we use gitpubsub and have
> > > > registered our github.com/apache/sling-site repository with
> > > > Kibble. That repository's master branch holds all the HTML we
> > > > publish, including lots of Javadocs, Maven plug-in
> > > > documentation, etc.
> > >
> > > The easiest path would be to simply exclude the sling-site
> > > repository in your reports. If you're using a quick filter,
> > > instead of filtering on 'sling', you could do a negative
> > > lookahead and filter on 'sling(?!-site)' as the quick filter
> > > accepts regular expressions.
> >
> > Thanks for the suggestion. I ended up excluding the sling-site
> > repository completely from the 'Apache Sling' view. It's not ideal
> > as it does not capture documentation contributions, which are
> > quite important as well.
> >
> > It would be great if in the future we would have a more
> > fine-grained solution.
>
> Ideal solutions are rare :)
> Could you elaborate on exactly *what* you want to see, and what you
> want to filter away? Some things may be possible, but when you have
> to do aggregations on something like 3 million commits in real-time,
> it gets tricky to exclude paths and individual files without
> throwing a huge lag spike into the mix.

If you look at the sling-site repository at [1] we have the actual
documentation under src/main/jbake, with
- content being markdown files
- templates being ... well ... templates
- and assets being static files

Some of those static files are generated (javadoc, Maven plugin sites)
and should not be recorded by Kibble. Those are the ones that are
problematic, especially since we have a large number of javadocs
committed.

$ find src/main/jbake/assets/apidocs -type f | wc -l
7127

Those are the ones we'd like excluded, if at all possible.

Thanks,

Robert

[1]: https://github.com/apache/sling-site/tree/master/src/main/jbake
Re: Possible to exclude directories from analysis?
On 09/12/2018 12:00 PM, Robert Munteanu wrote:
> On Sat, 2018-09-08 at 12:54 +0200, Daniel Gruno wrote:
> > On 09/05/2018 08:38 PM, Robert Munteanu wrote:
> > > Hi,
> > >
> > > I'm using the demo Kibble instance to visualise code
> > > contributions for the Apache Sling project. One thing I noticed
> > > is that Kibble thinks we're 75% HTML, which is not right - we're
> > > a Java project.
> > >
> > > I think it's due to the fact that we use gitpubsub and have
> > > registered our github.com/apache/sling-site repository with
> > > Kibble. That repository's master branch holds all the HTML we
> > > publish, including lots of Javadocs, Maven plug-in
> > > documentation, etc.
> >
> > The easiest path would be to simply exclude the sling-site
> > repository in your reports. If you're using a quick filter,
> > instead of filtering on 'sling', you could do a negative lookahead
> > and filter on 'sling(?!-site)' as the quick filter accepts regular
> > expressions.
>
> Thanks for the suggestion. I ended up excluding the sling-site
> repository completely from the 'Apache Sling' view. It's not ideal as
> it does not capture documentation contributions, which are quite
> important as well.
>
> It would be great if in the future we would have a more fine-grained
> solution.

Ideal solutions are rare :)
Could you elaborate on exactly *what* you want to see, and what you want
to filter away? Some things may be possible, but when you have to do
aggregations on something like 3 million commits in real-time, it gets
tricky to exclude paths and individual files without throwing a huge lag
spike into the mix.

> Thanks,
> Robert
Re: Possible to exclude directories from analysis?
On Sat, 2018-09-08 at 12:54 +0200, Daniel Gruno wrote:
> On 09/05/2018 08:38 PM, Robert Munteanu wrote:
> > Hi,
> >
> > I'm using the demo Kibble instance to visualise code contributions
> > for the Apache Sling project. One thing I noticed is that Kibble
> > thinks we're 75% HTML, which is not right - we're a Java project.
> >
> > I think it's due to the fact that we use gitpubsub and have
> > registered our github.com/apache/sling-site repository with
> > Kibble. That repository's master branch holds all the HTML we
> > publish, including lots of Javadocs, Maven plug-in documentation,
> > etc.
>
> The easiest path would be to simply exclude the sling-site repository
> in your reports. If you're using a quick filter, instead of filtering
> on 'sling', you could do a negative lookahead and filter on
> 'sling(?!-site)' as the quick filter accepts regular expressions.

Thanks for the suggestion. I ended up excluding the sling-site
repository completely from the 'Apache Sling' view. It's not ideal as it
does not capture documentation contributions, which are quite important
as well.

It would be great if in the future we would have a more fine-grained
solution.

Thanks,

Robert
Re: Possible to exclude directories from analysis?
On 09/05/2018 08:38 PM, Robert Munteanu wrote:
> Hi,
>
> I'm using the demo Kibble instance to visualise code contributions
> for the Apache Sling project. One thing I noticed is that Kibble
> thinks we're 75% HTML, which is not right - we're a Java project.
>
> I think it's due to the fact that we use gitpubsub and have
> registered our github.com/apache/sling-site repository with Kibble.
> That repository's master branch holds all the HTML we publish,
> including lots of Javadocs, Maven plug-in documentation, etc.

The easiest path would be to simply exclude the sling-site repository in
your reports. If you're using a quick filter, instead of filtering on
'sling', you could do a negative lookahead and filter on
'sling(?!-site)' as the quick filter accepts regular expressions.

With regards,
Daniel.

> Is it possible to exclude a certain directory from analysis, to make
> the statistics more relevant?
>
> Thanks,
> Robert
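For anyone unsure what the negative lookahead suggested above does, here is a small Python sketch. It assumes the quick filter treats the pattern as an unanchored search, the way Python's `re.search` does; that is an assumption about Kibble, not something confirmed in this thread. Apart from sling-site, the repository names are made up for the example.

```python
import re

# Hypothetical repository names; only 'sling-site' comes from the thread.
repos = [
    "sling-site",
    "sling-org-apache-sling-api",
    "sling-whiteboard",
    "jackrabbit-oak",
]

# Negative lookahead: match 'sling' only where it is NOT immediately
# followed by '-site'.
pattern = re.compile(r"sling(?!-site)")

matching = [name for name in repos if pattern.search(name)]
print(matching)
```

Because the search is unanchored, a name containing a second 'sling' that is not followed by '-site' would still match, so the pattern may need tightening for unusual repository names.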