Re: pro coding style
On Sat, 2012-12-01 at 17:18 +0100, Per Steffensen wrote: With change/merge-tracking in both systems, the important thing must be that you do not have to throw the tracked information away when you attempt to get your changes into the main repository.

People write commit messages in many different ways and have different working habits. Inserting all commit messages from a patch would probably be quite messy for some patches; e.g. I have a tendency to make many small commits, where a lot of those are just added TODOs, cosmetic enhancements or spelling corrections. Of course, git's rebase would mitigate this. As a non-committer and newly converted git user, I'd much prefer to use git for working on Lucene/Solr patches. Michael Sokolov's analysis is spot on. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: pro coding style
Robert Muir wrote: Right, I'm positive this (pull requests) is github :)

Well, as I said, I don't KNOW exactly where the border between git and github is, but I have a very mature logical sense and a fair amount of knowledge about information theory. The rest of the content in this mail is based only on my logical sense. I would love to be proved wrong and told that my logical sense played a trick on me :-)

My thoughts are that SVN is not dumb (always think the best about others :-) ), it just hasn't got enough information to help merging in a better way than what we have all experienced with SVN. If SVN is not dumb, the only way you can be better is by having more information. And I would really be mistaken if this is not the reason that git is able to act in a smarter way when merging. It simply has more information than SVN! So what kind of information does it have that SVN does not have? My guess is that it knows where code came from and how it developed throughout the life of a hierarchy of branches/forks, with a history going all the way back to the original common ancestor of the branches. If you have this kind of information, I would imagine you can do smarter merging, and most importantly you can maintain the detailed information about where code came from and how it developed even on the target side of the merge, so that you can also be smarter when that branch/fork has to be merged somewhere else.

So when I said in a prior mail "...else I couldn't imagine how you get the advantages you get. Remember that when using git you actually run a repository on every developer's local machine. When you commit, you commit only to your local repository. You need to push in order to have it upstreamed (as they call it)", it was actually an attempt to make an argument that all the smartness must be in git - I understand it was a very unclear argument. But the thing is that the local repository is NOT github - it is git.
I don't have github on my local machine :-) And it is information about the commits I did to my local git repository that potentially can make the entire process smarter. Without information from my local git repository about how I changed the code, it is not possible (again, only based on my mature logical sense) for the receiving git repository (eventually github) to be smarter than SVN. Therefore, since the clever information tracking needs to go on on my local machine, where only git (not github) lives, I argue that the smart thing is in git and not in github. Actually I couldn't imagine that a pull request isn't just a convenient alternative to sending a mail to the owners/committers of the target git fork, asking them to downstream this and that commit from my source git fork. I know it is a lot to claim, based alone on logical sense, but I trust my logical sense very much :-)

Below some thoughts about what kind of information you will need to be smarter than SVN - again, just completely off the top of my own head (I really don't know anything about git :-) ):

Imagine a file in a repository - content of file:

  abcd
  efgh
  ijkl
  mnop

Now we fork (branch) into three forks - fork1, fork2 and fork3. On fork1 the content of the file is changed into

  abcd
  1234
  efgh
  ijkl
  mnop

A line between original line #1 and #2 was inserted - a line with the content 1234. On fork2 the content of the file is changed into

  abcd
  efxyz
  ijkl
  mnop

The original line #2 was changed. Now you push the changes on fork1 into fork2. The net result on fork2 obviously is

  abcd
  1234
  efxyz
  ijkl
  mnop

In the meanwhile, on fork3 the content of the file has changed into

  abcd
  efgh
  ij
  mnop

The original line #3 was changed. Now you want to push the changes on fork2 (where some of it came from fork1) into fork3. Basically you have the problem of merging the following two versions of the file:

  abcd
  1234
  efxyz
  ijkl
  mnop

and

  abcd
  efgh
  ij
  mnop

In SVN you have no other information than the content of the two versions of the file about to be merged. With that amount of information it is impossible to make a solid decision about the net result -- merge conflict. If you had information about the history of changes since the common original (a line between original line #1 and #2 was inserted, the original line #2 was changed, and the original line #3 was changed), there would be no doubt that the correct net result must be

  abcd
  1234
  efxyz
  ij
  mnop

No merge conflict! I believe the thing about git is that it has this information and therefore that it can be smart. So back to my line of argument, that it has to be git and not github:
* The information needed to be smart in the final step has to be picked up in the early steps
* The early steps are (potentially) going on outside github
* Therefore github cannot be the smart one

A very important thing for this to work is that everything is a fork. In git a developer does not checkout a branch to do modifications on it. He forks the
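The worked example above can be sketched in code. Below is a minimal, purely illustrative three-way merge (NOT git's actual merge machinery, which is far more sophisticated): it computes each side's changes relative to the common ancestor and applies them, which is exactly the extra "history" information the example says a two-way diff lacks.

```python
from difflib import SequenceMatcher

def three_way_merge(base, ours, theirs):
    """Naive three-way merge: collect each side's changed spans relative to
    the common ancestor and apply them, failing on overlapping spans."""
    ops = []
    for side in (ours, theirs):
        for tag, i1, i2, j1, j2 in SequenceMatcher(None, base, side).get_opcodes():
            if tag != "equal":
                ops.append((i1, i2, side[j1:j2]))  # replace base[i1:i2] with this
    ops.sort(key=lambda op: (op[0], op[1]))
    merged, pos = [], 0
    for i1, i2, replacement in ops:
        if i1 < pos:  # both sides touched the same ancestor span
            raise ValueError("merge conflict: overlapping changes")
        merged.extend(base[pos:i1])
        merged.extend(replacement)
        pos = i2
    merged.extend(base[pos:])
    return merged

base  = ["abcd", "efgh", "ijkl", "mnop"]           # common ancestor
fork2 = ["abcd", "1234", "efxyz", "ijkl", "mnop"]  # after merging fork1 into fork2
fork3 = ["abcd", "efgh", "ij", "mnop"]             # original line #3 changed
print(three_way_merge(base, fork2, fork3))  # ['abcd', '1234', 'efxyz', 'ij', 'mnop']
```

With the ancestor available, the two sides' edits land on disjoint spans and merge cleanly; without it (as in the plain two-file comparison described above), the tool has no way to decide, hence the conflict.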
Re: pro coding style
On 12/1/2012 7:59 AM, Per Steffensen wrote: It is all about information - git has it, SVN doesn't. And my logical sense tells me that it has to be git and not github! :-) Now tell me that I am stupid :-)

This kind of information (merge tracking) has been in svn since 1.5 (see http://subversion.apache.org/docs/release-notes/1.5.html#merge-tracking). I believe this perception of SVN dates from its early days, when merging was indeed much more difficult: you had to keep track of all the merges you had done, to avoid doing them again, and it was a huge mess. That has pretty much been sorted out now.

Now it seems to me that the main advantage of git/github is that it doesn't create a strict boundary between committers and non-committers. As a committer, the two systems are basically the same up to differences in UI, convenience of tools, etc. But for a non-committer, with SVN the situation is irritating if you submit patches that you continue to use, but that are not accepted (or not in a timely way) into the main repository. In such a case, you either have to abandon the use of source control (OUCH!), or you have to fork the entire project and maintain your own repo, with no tools for integrating with the main repo. My understanding is that with git, you can maintain your own repo, and you have tools for taking changes from upstream repos, and also that the pull request mechanism may be more convenient than submitting patches. So this sounds, on the whole, much more attractive for outside contributors. I have to admit I've only fiddled with this a bit, so this is mostly based on what I've read and heard: please tell me that I am stupid! -Mike
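The merge tracking that svn 1.5 added can be pictured with a toy model (purely illustrative; real svn:mergeinfo is a versioned property storing per-path revision ranges, not a Python set): the branch remembers which source revisions were already merged in, so repeating a merge applies nothing twice.

```python
class Branch:
    """Toy model of svn 1.5-style merge tracking (illustrative only)."""

    def __init__(self):
        self.mergeinfo = set()  # revisions already merged in from the source
        self.applied = []       # what each merge actually changed

    def merge_from(self, revisions):
        # Skip anything mergeinfo says is already present -- before svn 1.5
        # the human had to keep this bookkeeping by hand.
        todo = [r for r in revisions if r not in self.mergeinfo]
        self.applied.extend(todo)
        self.mergeinfo.update(todo)
        return todo

branch = Branch()
print(branch.merge_from([101, 102, 103]))  # [101, 102, 103] -- all new
print(branch.merge_from([102, 103, 104]))  # [104] -- 102/103 are not re-merged
```

This is the "huge mess" Mike describes pre-1.5: without the recorded set, the second merge would blindly re-apply 102 and 103.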
Re: pro coding style
Michael Sokolov skrev: This kind of information (merge tracking) has been in svn since 1.5 (see http://subversion.apache.org/docs/release-notes/1.5.html#merge-tracking). [...]

Ok, so the necessary tracking of data seems to be in both. One might be better than the other in some aspects and vice versa. I stand corrected.

[...] So this sounds, on the whole, much more attractive for outside contributors. [...] -Mike

With change/merge-tracking in both systems, the important thing must be that you do not have to throw the tracked information away when you attempt to get your changes into the main repository. You certainly throw this information away when you create a dumb patch file.
Guess that we could make it work if outsiders were just allowed to make branches in Apache's SVN - we are not :-) So I guess that is the main benefit of git. It allows forks from the main repository to live remote from the main repository - that is, I would be able to make a fork of the Apache git repository (github or not), a fork that lives entirely on my own system of servers. And when I want to forward changes into Apache, they can go from my forked repository into the main repository (through upstreaming) without having to cross a border where the nice change/merge-tracking is lost. Still pretty sure that the stuff for this is all in git; but whether or not Apache would need access to my local repository (containing my fork) in order for the upstream from my repository to the Apache repository to be possible, when the actual action of accepting the upstream has to happen on the Apache side, I don't know. With GitHub my repository would live in the same place as Apache's, and then it would certainly be possible. But why the discussion? Why not just GitHub?! Regards, Per Steffensen
Re: pro coding style
On Fri, Nov 30, 2012 at 8:56 AM, Robert Muir rcm...@gmail.com wrote: Right, I'm positive this (pull requests) is github :) [...] It's the github workflow that contributors want (I would +1 some scheme that supports this!), but git by itself, is pretty unusable. Github is like a nice front-end to this mess.

This is like a medicine to me! With all the craze about git (and we use it for our main project and also for solr development) it just confirms my 3-year-long experience. Git is pain. Github is great (too bad there is git behind it ;)) And now the problem of forks - with git the fork is the natural evil - git just makes it established practice. But it still doesn't save us from the (slow) process of incorporating new patches. While it is inevitable, and we cannot be more grateful to all the committers for their hard work (really, thanks!), perhaps there is a way to make solr/lucene more sandbox friendly?
In our organization we are doing something similar (using SOLR as a library); the automated build/deployment goes like this:
- checkout our sources
- download/build solr sources
- compile our code
- merge with solr
- test
- deploy

This avoids forking solr and we always develop against the chosen branch. The pain was in porting the solr build infrastructure - if this infrastructure were inside solr, ready for developers to take advantage of, others would be saved the pain of reinventing it. As far as I am aware, there is only one hard problem - the confusing nature of the classloaders inside webcontainers; I have really had a hard time understanding it well enough to make it right - but there are surely more knowledgeable people here. And if the worst comes to the worst, the automated procedure could easily merge jars. Sounds evil? Is forking Solr a better way? roman
Re: pro coding style
or you will lose contributors / I think the type of people we are looking for tend to stick around ;)

I know several companies with a forked SOLR. Why? The reason is that it is fucking difficult to get their patches into SOLR in time. That way you are losing the most valuable contributions. You need to work faster to keep them interested. Also, you mentioned that contributor patches are low priority for your project; this is why you are losing them.
Re: pro coding style
Instead of educating others about what's good and bad, how about if you take some more time studying the sources of Lucene/Solr and its build system?

I did; I had to figure out how to build that thing. In a standardized maven environment, mvn package is all you need to do. No need to spend minutes reading ant scripts.

Your observations are superficial to say the least: POM files are generated dynamically

Yes, it's common sport in ant projects to generate POM files for uploaded artifacts. You guys took it to the next level? Generate poms with ant + then build with maven?
Re: pro coding style
On Fri, Nov 30, 2012 at 3:56 PM, Robert Muir rcm...@gmail.com wrote: Right, I'm positive this (pull requests) is github :) Just a note - pull request has been a git concept before github embraced and extended it. However, almost nobody uses the old meaning, and it's really only useful for projects like the Linux kernel, where everything is done through the mailing list. http://stackoverflow.com/a/6235394/7581
Re: pro coding style
In the past git had bad tooling; that is not the case today. I've been using git also without github screens - and while they definitely add a lot, it is still ten times more usable than SVN. As I told the Lucene.NET mailing list, you should all watch the following video and give git a few days of your time before continuing with this discussion: http://www.youtube.com/watch?v=4XpnKHJAok8 Also, Apache mirrors to github, so basically you work against github all the time.

On Fri, Nov 30, 2012 at 4:15 PM, Robert Muir rcm...@gmail.com wrote: [...] I just feel like a lot of what makes github successful is unfortunately actually in github and not git. [...]
Re: pro coding style
On Sun, Dec 2, 2012 at 2:13 AM, Israel Tsadok itsa...@gmail.com wrote: Just a note - pull request has been a git concept since before github embraced and extended it. [...] http://stackoverflow.com/a/6235394/7581

Dude, the old meaning ('git-request-pull') basically creates a patch at best. That's not at all what's being discussed here.
Re: pro coding style
It's also classic git brokenness to have confusing names like this - a command called git-request-pull that doesn't do anything like a pull request. These are the reasons why git is unusable!

On Sun, Dec 2, 2012 at 2:47 AM, Robert Muir rcm...@gmail.com wrote: [...] Dude, the old meaning ('git-request-pull') basically creates a patch at best. That's not at all what's being discussed here.
Re: pro coding style
Everything below is my humble opinion and input - I DON'T MEAN TO OFFEND ANYONE

Radim Kolar wrote: what you should do: * stuff i do

I like people with confidence, but it is a balance :-) Every decent developer in the world believes that he is the best in the world. Chances are that he is not. Be humble.

+ * ant - maven

Maven is a step forward, but it is still crap. I believe the original creator of ant has apologized in public for basing it on XML. Maven is also based on XML, besides being way too complex in infrastructure - goals, phases, environments, strange plugins with executions mapping to phases etc. XML is good for static data/config stuff, but a build process is not static data/config - it is a process. Go gradle!

I don't have either; if I decide to go with SOLR instead of EC, I will fork it. It will save me a lot of time.

We are basically handling our own version of Solr at my organization, because it is so hard to get contributions in - SOLR-3173, SOLR-3178, SOLR-3382, SOLR-3428, SOLR-3383 etc - and lately SOLR-4114 and SOLR-4120. It is really hard keeping up with the latest versions of Apache Solr, because it is a huge job to merge new stuff into our Solr. We are considering taking the consequence and forking our own public (to let others benefit and contribute) variant of Solr. I understand that no committers are really assigned to focus on committing other people's stuff, but it is a shame. I would really, really not like Solr to end up in a situation where many organizations run their own little fork. Instead we should all collaborate on improving the one and only Solr! Maybe we should try to find a sponsor to pay for a full-time Solr committer with the main focus on verifying and committing contributions from the outside.

* svn - git (way better tools) / I think we had this discussion already and it seems that lots of folks are positive, yet there is still some barrier infrastructure-wise along the lines. / don't blame infrastructure, other apache projects are using it.
Git is the way forward. It will also make committing outside contributions easier (especially if the commit is to be performed after the branch has developed a lot since the pull request was made). Merging among branches will also become easier. Why? Basically, since a pull request (request to merge) is an operation handled/known by git, it allows git to maintain more information about where merged code fits into the code-base considering revisions etc. That information can be used to ease future or late merges.

* split code into small manageable maven modules / see above - we have a fully functional maven build but ant is our primary build. / i don't see pom.xml in your source tree.

Have a look at the templates in dev-tools/maven. Do

  ant -Dversion=$VERSION get-maven-poms

to get your maven stuff generated in the folder maven-build. The Maven build does not work 100% out of the box (at least on the lucene_solr_4_0 branch), but it is very close.

* use github to track patches / wait, why is github good for patches? / you can track patch revisions and apply/browse/comment on them easily. Also it's way easier to upload a patch and do a pull request than to attach it to a ticket in jira. See the comments under git above.

Besides that I have some additional input, now that we are talking. Basically, that code is a mess. Not blaming anyone in particular. It's probably to some extent the nature of open source. If someone honestly believes that the code-base is beautiful, they should find something else to do. Some of the major problems are:
* Bad separation of concerns
** Very long classes/methods dealing with a lot of different concerns
*** Example: DistributedUpdateProcessor - dealing with cloud/standalone-, phases-, optimistic-locking, calculating values for document-fields (for add/inc/set requests), routing etc.
This should all be separated into different classes, each dealing with a different concern.
** Code dealing with a particular concern is spread all over the code - it makes it very hard to change strategy for this concern
*** Example: An obvious separate concern is routing (the decision about which shard under a collection a particular document belongs to (should be indexed in and found in), and where a particular request needs to go - leaders, replicas, all shards under the collection?). This concern is dealt with in a lot of places - DistributedUpdateProcessor, CloudSolrServer, RealTimeGetComponent, SearchHandler etc.
** In my patch for TLT-3178 I have made a separate concern called UpdateSemantics. It deals with decisions on stuff related to how updates should be performed, depending on which update-semantics you have chosen (classic, consistency or classic-consistency-hybrid). This class UpdateSemantics is used from the actual updating component DirectUpdateHandler2 - instead of having a lot of if-else-if-else statements in DirectUpdateHandler2 itself.
* Copied code
** A lot of code is clearly just copied from another
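The UpdateSemantics idea above can be illustrated with a small strategy-pattern sketch. Everything here is hypothetical (the real patch is Java inside DirectUpdateHandler2, and the method and behavior are invented for illustration); only the class name UpdateSemantics and the three semantics names come from the description above.

```python
class UpdateSemantics:
    """Strategy interface: semantics-dependent decisions live here instead of
    as if-else-if-else chains inside the update handler."""
    def requires_version_check(self, doc):
        raise NotImplementedError

class ClassicSemantics(UpdateSemantics):
    def requires_version_check(self, doc):
        return False                    # classic: last writer wins

class ConsistencySemantics(UpdateSemantics):
    def requires_version_check(self, doc):
        return True                     # consistency: always check versions

class HybridSemantics(UpdateSemantics):
    def requires_version_check(self, doc):
        # classic-consistency-hybrid: only check when the client sent a version
        return "version" in doc

def handle_update(doc, semantics):
    """Stand-in for the updating component: it just delegates the decision."""
    return "checked" if semantics.requires_version_check(doc) else "unchecked"

print(handle_update({"id": 1}, ClassicSemantics()))               # unchecked
print(handle_update({"id": 1}, ConsistencySemantics()))           # checked
print(handle_update({"id": 1, "version": 7}, HybridSemantics()))  # checked
```

The point of the pattern is that adding a fourth semantics means adding one class, not editing branch logic scattered through the handler.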
Re: pro coding style
I see your point about bringing up bugs nobody thought to cover manually, but it also has cons - e.g. violating the principle that tests should be (easily) repeatable (you will/can end up with tests that sometimes fail and sometimes succeed, and you have to dig out the random values of the failing tests in order to be able to repeat/reconstruct the failure)

Randomized tests should be identical in their execution given the same seed; it's the same principle as with regular tests, but it expands onto different code paths every time you execute with a different seed. They are not a replacement for boundary condition tests; they're a complementary thing that should allow picking up things you haven't thought of. Sure, in case of a failure you need to find the seed that caused the problem, but that doesn't seem like a lot of effort given the potential profit. If you want identical runs -- fix the initial seed. If you have a non-deterministic test for a given fixed seed, it'd be equally non-deterministic if no randomization was used; it's just a flawed test (or inherently non-deterministic by nature, so assertions should be relaxed). Dawid
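The "same seed, same execution" point is easy to demonstrate with a generic sketch (this is plain Python, not the Lucene/Solr test framework): all randomness flows from a single seeded generator, so a reported seed replays a failing run exactly.

```python
import random

def randomized_reverse_test(seed):
    """Property-style test: reversing a list twice is the identity.
    Every random draw comes from Random(seed), so a failure is fully
    reproducible by rerunning with the seed printed in the message."""
    rnd = random.Random(seed)
    data = [rnd.randint(-1000, 1000) for _ in range(rnd.randint(0, 50))]
    assert list(reversed(list(reversed(data)))) == data, f"failed with seed={seed}"
    return data

# A fresh seed per run explores new inputs; a fixed seed gives identical runs.
randomized_reverse_test(random.randrange(2**32))   # exploration
assert randomized_reverse_test(12345) == randomized_reverse_test(12345)  # replay
```

If such a test is still non-deterministic under a fixed seed, the randomness is leaking in from somewhere else (time, threads, iteration order), which is exactly the "flawed test" case described above.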
Re: pro coding style
Per Steffensen skrev: Spot on! Good arguments. When you just do not think of randomized tests as a replacement for boundary condition tests etc.

Thanks. Will consider randomized tests for my projects in the future - with limits :-) Regards, Per Steffensen

Dawid Weiss skrev: [...] Randomized tests should be identical in their execution given the same seed [...] If you want identical runs -- fix the initial seed. [...] Dawid
Re: pro coding style
When you just do not think of randomized tests as a replacement for boundary condition tests etc

I never claimed they were; in fact, I always make it very explicit that it's just another tool for yet another type of problem. I typically write the tests for the conditions I can think of and put a randomized test in as an addition. And guess what typically fails first ;) Dawid
Re: pro coding style
On Fri, Nov 30, 2012 at 8:50 AM, Per Steffensen st...@designware.dk wrote: Robert Muir skrev: Is it really git? Because it's my understanding pull requests aren't actually a git thing but a github thing. The distinction is important.

Actually I'm not sure. I have never used git outside github, but at least part of it has to be git and not github (I think) - or else I couldn't imagine how you get the advantages you get. Remember that when using git you actually run a repository on every developer's local machine. When you commit, you commit only to your local repository. You need to push in order to have it upstreamed (as they call it)

Right, I'm positive this (pull requests) is github :) I just wanted to make this point: when we have discussions about using git instead of svn, I'm not sure it makes things easier on anyone; actually it is probably worse and more complex. It's the github workflow that contributors want (I would +1 some scheme that supports this!), but git by itself, is pretty unusable. Github is like a nice front-end to this mess.
Re: pro coding style
On Nov 30, 2012, at 8:56 AM, Robert Muir rcm...@gmail.com wrote: but git by itself, is pretty unusable.

Given the number of committers that eat some pain to use git when developing lucene/solr, and have no github or pull requests, I'm not sure that's a common thought :) - Mark
Re: pro coding style
On Fri, Nov 30, 2012 at 9:10 AM, Mark Miller markrmil...@gmail.com wrote: Given the number of committers that eat some pain to use git when developing lucene/solr, and have no github or pull requests, I'm not sure that's a common thought :)

Sure, some people might disagree with me. I'm more than willing to eat some pain if it makes contributions easier. I just feel like a lot of what makes github successful is unfortunately actually in github and not git. It's like if your development team is screaming for linux machines: you have to be careful how to interpret that. If you hand them a bunch of machines with just linux kernels, they probably won't be productive. When they scream for linux they want a userland with a shell, compiler, X-windows, editor and so on too.
Re: pro coding style
On Fri, Nov 30, 2012 at 3:48 PM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: RandomizedTesting for the win! Thanks a ton Dawid. +1 -- Adrien
Re: pro coding style
On Fri, Nov 30, 2012 at 9:52 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: RandomizedTesting for the win! Thanks a ton Dawid. / I didn't invent this thing, I merely wrapped it up, cleaned up the rough edges and extracted it to a stand-alone package. Lucene/Solr contributors should be credited for introducing the concept. And there's also research literature dating waaay back, so I don't think the concept is entirely new -- it just never caught on.

Caught on slowly... I had been using it before I became a Lucene committer in '05, and used it in Lucene/Solr for anything that had enough complexity to warrant it. https://issues.apache.org/jira/browse/LUCENE-395?focusedCommentId=12356746&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12356746 And one of my personal favorites, I think the first random indexing test - TestStressIndexing2: https://issues.apache.org/jira/browse/LUCENE-1173?focusedCommentId=12567845&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12567845 But yeah, it's only become a religion here recently. The support in the framework is certainly welcome! -Yonik http://lucidworks.com
Re: pro coding style
Caught on slowly... I had been using it before I became a Lucene

Yep, so did I, albeit in a slightly different flavor -- always starting from a static seed and running a certain number of randomized iterations of things, usually at a higher level. Kind of sanity checking, I guess. I don't know why I hadn't thought of just picking a different seed every time.

But yeah, it's only become a religion here recently.

Come on, I don't think it's that bad :) We may differ in opinions on certain things (like which tests to run and when) but I think everyone shares the same overall goal of having well-tested code. Dawid
pro coding style
if you talk about my yesterday work, then no reformats were done because the code was already properly formatted. Also all code was hand written; no generated code was used. Generated code is not committed to git anyway.

my hard limits for code quality (checked at commit):
* no findbugs warnings with level 14+
* code coverage 80%
* code coverage in critical parts 95%
* list of PMD warnings that stop a commit
* generation of call tree graph - check it for cycles, checking for calling the same procedure from different levels (indicates bad code flow)
* all eclipse warnings turned into errors
* patched eclipse compiler to do better flow analysis
* code reformatted at commit
* javadoc everything, no warnings

what you should do:
* stuff i do
+ * ant - maven
* svn - git (way better tools)
* split code into small manageable maven modules
* get more people
* put trust into your testing, not into perfect people
* work faster
* use github to track patches
* use spring for integration testing
* use jenkins to do tests on incoming patches
* do library checks for number of functions really used
* contributor patches should be high priority or you will lose contributors

i am giving lessons sometimes: about 1-2 sessions per year for 14 people, if i have spare time. But it's a waste of time; most ppl will not follow. learn this:

SLOW CODING != BUG FREE CODE
GOOD TESTS + GOOD STATIC TESTING = GOOD BUG FREE CODE
CODE STYLE != GAME WITH SPACES AND { }
GOOD TESTS = 2x TIME NEEDED TO CODE STUFF UNDER TEST
GOOD TESTS ARE MORE VALUABLE THAN GOOD CODE
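The commit-time hard limits above boil down to a gate like the following sketch. This is hypothetical glue code (real enforcement would run findbugs/PMD and a coverage tool and parse their reports); only the thresholds come from the list.

```python
def commit_gate(findbugs_high_warnings, coverage, critical_coverage, pmd_blockers):
    """Return the list of violated hard limits; an empty list allows the commit."""
    problems = []
    if findbugs_high_warnings > 0:
        problems.append("findbugs warnings at level 14+")
    if coverage < 0.80:
        problems.append("overall code coverage below 80%")
    if critical_coverage < 0.95:
        problems.append("critical-path coverage below 95%")
    if pmd_blockers > 0:
        problems.append("blocking PMD warnings present")
    return problems

print(commit_gate(0, 0.85, 0.97, 0))  # [] -- commit allowed
print(commit_gate(1, 0.85, 0.90, 0))  # two violations -- commit rejected
```

Wired into a pre-commit hook, a non-empty result would abort the commit with the listed reasons.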
Re: pro coding style
hey, some comments inline...

On Thu, Nov 29, 2012 at 7:48 PM, Radim Kolar <h...@filez.com> wrote:

> if you talk about my yesterday work then no reformats were done because
> the code was already properly formatted. Also, all code was hand written;
> no generated code was used. Generated code is not committed to git anyway.
>
> my hard limits for code quality (checked at commit):
> * no findbugs warnings with level 14+
> * code coverage 80%
> * code coverage in critical parts 95%
> * list of PMD warnings that stop a commit
> * generation of a call tree graph - check it for cycles, and for calls to
>   the same procedure from different levels (indicates bad code flow)
> * all eclipse warnings turned into errors
> * patched eclipse compiler to do better flow analysis
> * code reformatted at commit
> * javadoc everything, no warnings
>
> what you should do:
> * stuff i do +
> * ant -> maven

I suggest you start with this; make sure you have enough time and energy for the discussion.

> * svn -> git (way better tools)

I think we had this discussion already, and it seems that lots of folks are positive, yet there is still some infrastructure barrier along the lines.

> * split code into small manageable maven modules

see above - we have a fully functional maven build, but ant is our primary build. My honest opinion: forget what I said above - don't try.

> * get more people

good point - can you refer us to some? In my experience they are pretty hard to find.

> * put trust into your testing, not into perfect people

ahh yeah, testing, we should do that at some point

> * work faster

wow - I never thought about that, though!

> * use github to track patches

wait, why is github good for patches?

> * use Spring for integration testing

sorry, we are a no-dependency library.

> * use jenkins to run tests on incoming patches

patches welcome

> * do library checks for the number of functions really used

hmm - we are a library?

> * contributor patches should be high priority or you will lose contributors

that is good advice for such a young project.
> i sometimes give lessons: about 1-2 sessions per year for 14 people, if i
> have spare time. But it's a waste of time; most people will not follow.
> learn this:
>
> SLOW CODING != BUG FREE CODE
> GOOD TESTS + GOOD STATIC TESTING = GOOD BUG FREE CODE
> CODE STYLE != GAME WITH SPACES AND { }
> GOOD TESTS = 2x TIME NEEDED TO CODE STUFF UNDER TEST
> GOOD TESTS ARE MORE VALUABLE THAN GOOD CODE

let's drop the code, it's a hassle to maintain anyway! thanks man, this mail made my day!

simon
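On the "use github to track patches" question: one common git workflow is to keep each patch on its own branch and push every revision to the same branch, so the full history of the patch stays browsable. A fully local sketch, assuming nothing about the project's actual setup (the bare repository stands in for a remote fork, and the issue number is made up):

```shell
#!/bin/sh
# Demo of tracking a patch as a branch instead of attaching diffs to a
# ticket. Everything here is local and illustrative.
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q --bare upstream.git          # stands in for the remote fork
git clone -q upstream.git work
cd work
git config user.email dev@example.org    # demo identity only
git config user.name "Demo Dev"

echo 'first cut of the fix' > Fix.java
git add Fix.java
git commit -qm "LUCENE-0000: first cut of the fix"
git push -q origin HEAD:patch/LUCENE-0000    # publish patch revision 1

echo 'second cut, after review' > Fix.java
git commit -qam "LUCENE-0000: address review comments"
git push -q origin HEAD:patch/LUCENE-0000    # same branch, revision 2

# Both revisions of the patch remain visible on the shared branch:
git -C "$tmp/upstream.git" log --oneline patch/LUCENE-0000
```

A pull request is then just a request to merge `patch/LUCENE-0000`, with the per-revision history and per-line comments layered on top by the hosting site rather than by git itself, which matches the git-vs-github distinction drawn earlier in the thread.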
Re: pro coding style
> > what you should do:
> > * stuff i do +
> > * ant -> maven
>
> I suggest you start with this; make sure you have enough time and energy
> for the discussion.

I don't have either. If I decide to go with SOLR instead of EC, I will fork it. It will save me a lot of time.

> > * svn -> git (way better tools)
>
> I think we had this discussion already, and it seems that lots of folks
> are positive, yet there is still some infrastructure barrier along the
> lines.

don't blame infrastructure; other apache projects are using it.

> > * split code into small manageable maven modules
>
> see above - we have a fully functional maven build, but ant is our
> primary build.

i don't see a pom.xml in your source tree.

> good point - can you refer us to some? In my experience they are pretty
> hard to find.

i do not know people who believe that a process designed to be slow is a good process. We here believe that fast process = high salary.

> > * use github to track patches
>
> wait, why is github good for patches?

you can track patch revisions and apply/browse/comment on them easily. Also, it's way easier to upload a patch and do a pull request than to attach it to a ticket in jira.

> > * use Spring for integration testing
>
> sorry, we are a no-dependency library.

<scope>test</scope>
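The `<scope>test</scope>` retort refers to Maven dependency scoping: a test-scoped dependency is on the classpath for compiling and running tests only, and does not become a compile-time or runtime dependency of the published artifact. A sketch of how that would look in a pom.xml -- the coordinates and version here are illustrative, not taken from any actual build:

```xml
<!-- Illustrative pom.xml fragment: Spring's test support as a
     test-only dependency. With scope "test" it never reaches the
     shipped artifact's runtime classpath, so the "no-dependency
     library" property of the released jar is preserved. -->
<dependency>
  <groupId>org.springframework</groupId>
  <artifactId>spring-test</artifactId>
  <version>3.1.3.RELEASE</version>
  <scope>test</scope>
</dependency>
```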
Re: pro coding style
> i don't see a pom.xml in your source tree.

Instead of educating others about what's good and bad, how about taking some more time to study the sources of Lucene/Solr and its build system? Your observations are superficial, to say the least: the POM files are generated dynamically, and the test infrastructure is among the more sophisticated things to be found. With multiple CI systems running the code all the time, the coverage is great across JVMs, and the randomization really brings up bugs nobody thought to cover manually.

Dawid