+1 for a nightly pip with a fixed name. We need this to track MXNet integration with other packages such as Horovod.

Sam, when do you think we can have this nightly build with a fixed name?

Thanks,
Lin

On Sun, Jan 5, 2020 at 7:48 PM Skalicky, Sam <sska...@amazon.com.invalid> wrote:

> Hi Tao,
>
> We don't have this yet, but we did think about putting the latest wheels in
> a specific place in the S3 bucket so they are always updated. Initially we
> decided not to do this since the main MXNet CD should have been fixed. But
> since it's still not fixed, we might go ahead and do this.
>
> Sam
>
> On Jan 5, 2020, at 6:02 PM, Lv, Tao A <tao.a...@intel.com> wrote:
>
> Hi,
>
> How do I install the latest available build of a flavor without specifying
> the build date? Something like `pip install mxnet --pre`.
>
> Thanks,
> -tao
>
> -----Original Message-----
> From: Skalicky, Sam <sska...@amazon.com.INVALID>
> Sent: Monday, January 6, 2020 2:09 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Re: Stopping nightly releases to Pypi
>
> Hi Haibin,
>
> You typed the correct URLs. The cu100 build has been failing since
> December 30th, but other builds have succeeded. The wheels are being
> delivered into a public bucket that anyone with an AWS account can access
> and poke around in. Here's the URL for web access:
>
> https://s3.console.aws.amazon.com/s3/buckets/apache-mxnet/dist/2020-01-01/dist/?region=us-west-2&tab=overview
>
> You will have to log into your AWS account to access it, however (which
> means you'll need an AWS account).
>
> It looks like only the following flavors are available for 2020-01-01:
>
>   mxnet
>   mxnet-cu92
>   mxnet-cu92mkl
>   mxnet-mkl
>
> Sam
>
> On Jan 4, 2020, at 9:06 PM, Haibin Lin <haibin.lin....@gmail.com> wrote:
>
> I was trying the nightly builds, but none of them is available:
>
>   pip3 install https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-01/dist/mxnet_cu100-1.6.0b20200101-py2.py3-none-manylinux1_x86_64.whl --user
>   pip3 install https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-02/dist/mxnet_cu100-1.6.0b20200102-py2.py3-none-manylinux1_x86_64.whl --user
>   pip3 install https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-03/dist/mxnet_cu100-1.6.0b20200103-py2.py3-none-manylinux1_x86_64.whl --user
>   pip3 install https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-04/dist/mxnet_cu100-1.6.0b20200104-py2.py3-none-manylinux1_x86_64.whl --user
>
> ERROR: Could not install requirement mxnet-cu100==1.6.0b20200103 from
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-03/dist/mxnet_cu100-1.6.0b20200103-py2.py3-none-manylinux1_x86_64.whl
> because of HTTP error 404 Client Error: Not Found for url:
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-03/dist/mxnet_cu100-1.6.0b20200103-py2.py3-none-manylinux1_x86_64.whl
> for URL
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2020-01-03/dist/mxnet_cu100-1.6.0b20200103-py2.py3-none-manylinux1_x86_64.whl
>
> Please let me know if I typed the wrong URLs.
>
> 1. The discoverability of available nightly builds needs improvement. If
>    someone can help write a script to list all links that exist, that would
>    be very helpful.
> 2. If a nightly build does not build successfully, how does the community
>    learn the reason for the failure and potentially offer help? Currently I
>    don't have much visibility into the nightly build status.
>
> Best,
> Haibin
>
> On Fri, Jan 3, 2020 at 5:47 PM Pedro Larroy <pedro.larroy.li...@gmail.com> wrote:
>
> Just to clarify, the current CI is quite an overhead to maintain for
> several reasons, and this complexity is overkill for CD. Jenkins also has
> constant plugin upgrades and security vulnerabilities, and has to be
> restarted from time to time as it stops working. Making binary builds from
> an environment that runs unsafe code is, I think, not good practice. So for
> that, having a separate Jenkins, CodeBuild, Drone or a separate Jenkins
> node is the right solution. I agree with you that it is just a scheduler,
> but somebody is making an effort to keep it running. If you have the
> appetite and resources to duplicate it for CD, please go ahead.
>
> On Fri, Jan 3, 2020 at 3:25 PM Marco de Abreu <marco.g.ab...@gmail.com> wrote:
>
> Regarding your point of finding somebody to maintain the solution: At
> Apache we usually retire things if there's no maintainer, since that
> indicates that the feature/system is not of enough interest to warrant
> maintenance - otherwise, someone would step up.
>
> While assistance in the form of a fix is always appreciated, the fix still
> has to conform with the way this project and Apache operate.
> Next time I'd recommend contributing time to improving the existing
> community solution instead of developing an internal system.
>
> -Marco
>
> Marco de Abreu <marco.g.ab...@gmail.com> wrote on Sat, Jan 4, 2020, 00:21:
>
> Sam, while I understand that this solution was developed out of necessity,
> my question is why a new system was developed instead of fixing the
> existing one or adapting the solution. CodeBuild is a scheduler in the same
> fashion as Jenkins is. It runs code. So you can adapt it to Jenkins without
> much hassle.
>
> I'm not volunteering for this - why should I? The role of a PMC member is
> to steer the direction of the project. Just because a manager points
> in a certain direction, it doesn't mean that they're going to do it.
>
> Apparently there was enough time at some point to develop a new solution
> from scratch. It might have been a solution for your internal team and
> that's fine, but upgrading it "temporarily" to be the advertised way on the
> official website is something different.
>
> I won't argue about how the veto can be enforced. I think it's in the best
> interest of the project if we try working on a solution instead of spending
> time on trying to figure out the power of the PMC.
>
> Pedro, that's certainly a step in the right direction. But committers
> would also need access to the control plane of the system - to trigger,
> stop and audit builds. We could go down that road, but I think the fewer
> systems, the better - also for the sake of maintainability.
>
> Best regards,
> Marco
>
> Pedro Larroy <pedro.larroy.li...@gmail.com> wrote on Fri, Jan 3, 2020, 20:55:
>
> I'm not involved in such efforts, but one possibility is to have the YAML
> files that describe the pipelines for CD in the Apache repositories. Would
> that be acceptable from the Apache POV?
> In the end they should be very thin and just call the scripts that are
> part of the CD packages.
>
> On Fri, Jan 3, 2020 at 6:56 AM Marco de Abreu <marco.g.ab...@gmail.com> wrote:
>
> Agree, but the question of how a non-Amazonian is able to maintain and
> access the system is still open. As it stands right now, the community has
> taken a step back and loses some control if we continue down that road.
>
> I personally disapprove of that approach since committers are no
> longer in control of that process. So far it seems like my questions were
> skipped and further actions have been taken. As openness and the community
> having control are part of our graduation criteria, I'm putting in my veto
> with a grace period until the 15th of January. Please bring the system into
> a state that aligns with Apache values or revert the changes.
>
> -Marco
>
> Pedro Larroy <pedro.larroy.li...@gmail.com> wrote on Fri, Jan 3, 2020, 03:33:
>
> CD should be separate from CI for security reasons in any case.
>
> On Sat, Dec 7, 2019 at 10:04 AM Marco de Abreu <marco.g.ab...@gmail.com> wrote:
>
> Could you elaborate on how a non-Amazonian is able to access, maintain and
> review the CodeBuild pipeline? How come we've diverged from the community
> agreed-on standard where the public Jenkins serves the purpose of
> testing and releasing MXNet? I'd be curious about the issues you're
> encountering with Jenkins CI that led to a non-standard solution.
>
> -Marco
>
> Skalicky, Sam <sska...@amazon.com.invalid> wrote on Sat, Dec 7, 2019, 18:39:
>
> Hi MXNet Community,
>
> We have been working on getting nightly builds fixed and made available
> again. We've made another system using AWS CodeBuild & S3 to work around
> the problems with Jenkins CI, PyPI, etc.
> It is currently building all the flavors and publishing to an S3 bucket here:
>
> https://us-west-2.console.aws.amazon.com/s3/buckets/apache-mxnet/dist/?region=us-west-2
>
> There are folders for each set of nightly builds; try out the wheels
> starting today, 2019-12-07. Builds start at 1:30am PT (9:30am GMT) and
> arrive in the bucket 30 minutes to 2 hours later. Inside each folder are
> the wheels for each flavor of MXNet. Currently we're only building for
> Linux; builds for Windows/Mac will come later.
>
> If you want to download the wheels easily, you can use a URL of the form:
>
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/<YYYY-MM-DD>/dist/<mxnet_build>-1.6.0b<YYYYMMDD>-py2.py3-none-manylinux1_x86_64.whl
>
> Here's a set of links for today's builds:
>
> (Plain mxnet, no mkl no cuda)
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
>
> (mxnet-mkl)
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_mkl-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
>
> (mxnet-cuXXX)
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu90-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu92-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu100-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu101-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
>
> (mxnet-cuXXXmkl)
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu90mkl-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu92mkl-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu100mkl-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
> https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet_cu101mkl-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
>
> You can easily install these pip wheels on your system either by
> downloading them to your machine first and then installing with:
>
>   pip install /path/to/downloaded/wheel.whl
>
> Or you can install directly by giving the link to pip like this:
>
>   pip install https://apache-mxnet.s3-us-west-2.amazonaws.com/dist/2019-12-07/dist/mxnet-1.6.0b20191207-py2.py3-none-manylinux1_x86_64.whl
>
> Credit goes to everyone involved (in no particular order):
> Rakesh Vasudevan, Zach Kimberg, Manu Seth, Sheng Zha, Jun Wu, Pedro Larroy,
> Chaitanya Bapat
>
> Thanks!
> Sam
>
> On Dec 5, 2019, at 1:16 AM, Lausen, Leonard <lau...@amazon.com.INVALID> wrote:
>
> We don't lose pip by hosting on S3. We just don't host nightly releases
> on Pypi servers and mirror them to several hundred mirrors immediately
> after each build is published, which is very expensive for the Pypi
> project. People can still install the nightly builds with pip by
> specifying the -f option.
>
> Uploading weekly releases to Pypi will reduce the cost for Pypi by ~75%
> [1]. It may be acceptable to Pypi, but does it make sense for us? I'm not
> convinced a weekly release on Pypi is a good idea. If one release is
> buggy, users will need to wait 7 days for a fix. That doesn't provide a
> good user experience.
> If someone has a stronger conviction about the value of weekly releases on
> Pypi, that person should please go ahead and propose it in a separate
> discussion thread.
>
> Currently we don't have generally working nightly builds to Pypi and as a
> matter of fact we know that we can't have them due to Pypi's policy and our
> apparent need for large binaries. Given this fact and that no objection was
> raised by 2019-12-05 at 05:42 UTC, I conclude we have lazy consensus on
> stopping upload attempts of nightly builds to Pypi.
>
> With consensus established, we can change the CI job to stop trying to
> upload the nightly builds and then request Pypi to increase the limit.
> Then we have one less blocker for the 1.6 release.
>
> Best regards
> Leonard
>
> [1]: Lower cost due to fewer releases, but higher cost due to the 500MB ->
> 800MB limit increase, assuming that the limit increase translates into
> actually larger binaries.
>
> On Wed, 2019-12-04 at 22:20 +0100, Marco de Abreu wrote:
>
> Are weekly releases an option? It was brought up as a concern that we might
> lose pip as a pretty common distribution channel where people consume
> nightly builds.
> I don't feel like that concern has been properly addressed so far.
>
> -Marco
>
> Lausen, Leonard <lau...@amazon.com.invalid> wrote on Wed, Dec 4, 2019, 04:09:
>
> As a simple POC to test distribution, you can try installing MXNet based
> on these 3 URLs:
>
>   pip install --no-cache-dir https://mxnet-dev.s3.amazonaws.com/mxnet_cu101-1.5.1.post0-py2.py3-none-manylinux1_x86_64.whl
>   pip install --no-cache-dir https://mxnet-dev.s3-accelerate.amazonaws.com/mxnet_cu101-1.5.1.post0-py2.py3-none-manylinux1_x86_64.whl
>   pip install --no-cache-dir https://d19zq12jzu4w95.cloudfront.net/mxnet_cu101-1.5.1.post0-py2.py3-none-manylinux1_x86_64.whl
>
> where --no-cache-dir prevents caching the downloaded file, for the purpose
> of testing. (cu101 was chosen because of its large size.)
>
> The first URL uses a standard S3 bucket in the US. The second uses S3
> Accelerate, based on the CloudFront CDN. And the third uses the CloudFront
> CDN. I'm adding the third URL, as S3 Accelerate may or may not use all new
> CloudFront endpoints yet.
>
> Regarding voting: Uploading to Pypi is currently impossible, which is a
> reality (so there is no option to continue as we do currently). Pypi folks
> indicated they will unblock our uploads to Pypi once we stop uploading
> nightly releases and taking up 20% of their resources [1].
>
> If there are any shortcomings or problems identified with uploading to S3,
> we can work to address them. But for now, the status quo is broken and this
> seems the only solution addressing Pypi's problem.
>
> I don't mind if you state that you object to lazy consensus and start a
> vote.
> If your "maybe [...] start a proper vote" was supposed to be an
> objection to lazy consensus, please state so clearly (I'm not sure if
> "maybe" qualifies as an objection). Though I think it only makes sense
> with at least 2 options to vote on. Status quo is not a meaningful option,
> as it is already broken.
>
> Best regards
> Leonard
>
> [1]: https://github.com/pypa/pypi-support/issues/50#issuecomment-560479706
>
> On Tue, 2019-12-03 at 19:28 +0100, Marco de Abreu wrote:
>
> Excellent! Could we maybe come up with a POC and a quick writeup and then
> start a proper vote after everyone has verified that it covers their
> use cases?
>
> -Marco
>
> Sheng Zha <zhash...@apache.org> wrote on Tue, Dec 3, 2019, 19:24:
>
> Yes, there is. We can also make it easier to access by using a
> geo-location based DNS server so that China users are directed to that
> local mirror. The rest of the world is already covered by the global
> CloudFront.
>
> -sz
>
> On 2019/12/03 18:22:22, Marco de Abreu <marco.g.ab...@gmail.com> wrote:
>
> Isn't there an S3 endpoint in Beijing?
>
> It seems like this topic still warrants some discussion and thus I'd
> prefer if we don't move forward with lazy consensus.
>
> -Marco
>
> Tao Lv <mutou...@gmail.com> wrote on Tue, Dec 3, 2019, 14:31:
>
> * For pypi, we can use mirrors.
>
> On Tue, Dec 3, 2019 at 9:28 PM Tao Lv <mutou...@gmail.com> wrote:
>
> As we have many users in China, I'm considering the accessibility of S3.
> For pip, we can mirrors.
>
> On Tue, Dec 3, 2019 at 3:24 PM Lausen, Leonard <lau...@amazon.com.invalid> wrote:
>
> I would like to remind everyone that lazy consensus is assumed if no
> objections are raised before 2019-12-05 at 05:42 UTC. There has been some
> discussion about the proposal, but to my understanding no objections were
> raised.
>
> If the proposal is accepted, MXNet releases would be installed via
>
>   pip install mxnet
>
> And release candidates via
>
>   pip install --pre mxnet
>
> (or with the respective cuda version specifier appended etc.)
>
> To obtain releases built automatically from the master branch, users would
> need to specify something like the "-f
> http://mxnet.s3.amazonaws.com/mxnet-X/nightly.html" option to pip.
>
> Best regards
> Leonard
>
> On Mon, 2019-12-02 at 05:42 +0000, Lausen, Leonard wrote:
>
> Hi MXNet Community,
>
> For more than 2 months, our binary Python nightly releases published on
> Pypi have been broken. The problem is that our binaries exceed Pypi's size
> limit. Decreasing the binary size by adding compression breaks third-party
> libraries loading libmxnet.so:
> https://github.com/apache/incubator-mxnet/issues/16193
>
> Sheng requested Pypi to increase their size limit:
> https://github.com/pypa/pypi-support/issues/50
>
> Currently "the biggest cost for PyPI from [the many MXNet binaries with
> nightly release to Pypi] is the bandwidth consumed when several hundred
> mirrors attempt to mirror each release immediately after it's published".
> So Pypi is not inclined to allow us to upload even larger binaries on a
> nightly schedule. Their compromise is to allow it on a weekly cadence.
>
> However, I would like the community to revisit the necessity of releasing
> pre-release binaries to Pypi on a nightly (or weekly) cadence.
>
> Instead, we can release nightly binaries ONLY to a public S3 bucket and
> instruct users to install from there. On our side, we only need to prepare
> an HTML document that contains links to all released nightly binaries.
> Finally, users will install the nightly releases via
>
>   pip install --pre mxnet-cu101 -f http://mxnet.s3.amazonaws.com/mxnet-cu101/nightly.html
>
> Instead of
>
>   pip install --pre mxnet-cu101
>
> Of course proper releases and release candidates should still be made
> available via Pypi.
> Thus releases would be installed via
>
>   pip install mxnet-cu101
>
> And release candidates via
>
>   pip install --pre mxnet-cu101
>
> This will substantially reduce the costs of the Pypi project and in fact
> matches the installation experience provided by PyTorch. I don't think the
> benefit of not including "-f
> http://mxnet.s3.amazonaws.com/mxnet-cu101/nightly.html"
> outweighs the costs we currently externalize to the Pypi team.
>
> This suggestion seems uncontroversial to me. Thus I would like to start
> lazy consensus. If there are no objections, I will assume lazy consensus on
> stopping nightly releases to Pypi in 72hrs.
>
> Best regards
> Leonard
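
For reference, the "html document that contains links to all released nightly binaries" from the original proposal could be generated along these lines. This is only a rough sketch, not the project's actual tooling: it assumes the S3 URL layout from Sam's CodeBuild announcement, and the names `wheel_url` and `find_links_page` are made up for illustration. pip's -f/--find-links option accepts a URL to an HTML file and parses it for links to wheels, which is all the page needs to provide.

```python
from datetime import date
from html import escape

# Assumption: bucket layout as described in the CodeBuild announcement in this thread.
BASE = "https://apache-mxnet.s3-us-west-2.amazonaws.com/dist"


def wheel_url(flavor: str, day: date, version: str = "1.6.0") -> str:
    """Build the wheel URL for one flavor (e.g. "mxnet_cu101") and build date."""
    filename = f"{flavor}-{version}b{day:%Y%m%d}-py2.py3-none-manylinux1_x86_64.whl"
    return f"{BASE}/{day:%Y-%m-%d}/dist/{filename}"


def find_links_page(urls: list[str]) -> str:
    """Render a bare HTML page with one anchor per wheel, for `pip install -f`."""
    anchors = "\n".join(
        f'<a href="{escape(u, quote=True)}">{escape(u.rsplit("/", 1)[-1])}</a><br>'
        for u in urls
    )
    return f"<!DOCTYPE html>\n<html><body>\n{anchors}\n</body></html>\n"


if __name__ == "__main__":
    # Hypothetical input: two build dates for a single flavor.
    days = [date(2019, 12, 7), date(2019, 12, 8)]
    print(find_links_page([wheel_url("mxnet_cu101", d) for d in days]))
```

Note that the sketch does not check whether a wheel actually exists for a given date (the cu100 404s reported above would still be listed), so a real script would need to list the bucket contents or probe each URL first.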