Re: [Discuss] Serverless scientific computing (function as a service)
A couple of thoughts:

(1) Depending on specific libraries can be an unintended but unavoidable side effect of the programming language chosen. For example, we've seen plenty of examples of Python code that's quite brittle with respect to the Python version (and perhaps the versions of various packages). (MATLAB sometimes shows similar effects, though typically gentler than Python's, and so does Perl, but I've encountered very few research software developers who develop their workhorse codes primarily in Perl.)

(2) For floating point calculations, there's a reasonable argument that, if the code's replicability depends on the order in which various operations are executed, then that algorithm is too fragile to be relied on for meaningful research results.

The problem is that floating point representation is approximate. A double precision floating point number has 64 bits, meaning 2^64 possible values. But there are infinitely many real numbers, and in fact infinitely many real numbers between any two real numbers. So almost every value you try to represent has some error in its representation. On top of that, every calculation you do introduces additional error due to rounding.

As an example, consider this:

1.23 * 4.56 * 7.89 = 44.253432

That's 44.3 when rounded to 3 significant figures. Now let's recalculate, but this time rounding after each operation, which is analogous to what happens in real life in floating point arithmetic:

1.23 * 4.56 = 5.6088 ~= 5.61; 5.61 * 7.89 = 44.2629 ~= 44.3
1.23 * 7.89 = 9.7047 ~= 9.70; 9.70 * 4.56 = 44.2320 ~= 44.2
4.56 * 7.89 = 35.9784 ~= 36.0; 36.0 * 1.23 = 44.280 ~= 44.3

Which one of these is the "right" order? You can't possibly know that answer when you write the code, because the "right" order will depend on the values you're calculating on -- which you can't know while you're writing the code, or you wouldn't have had to write the code in the first place.
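The same order-dependence shows up directly in binary doubles -- here's a minimal Python sketch (the specific decimals are illustrative; binary doubles round differently from 3-significant-figure decimal, but the principle is identical):

```python
# Floating point addition is not associative: the grouping of
# operations changes which rounding errors occur, so the results differ.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # rounds after a+b, then again after adding c
right = a + (b + c)  # rounds after b+c, then again after adding a

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

Neither grouping is "right": both are within one rounding error of the exact answer, and which one you get depends purely on evaluation order.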
Now imagine doing zillions of these calculations to get your result -- not just multiplies but adds, subtracts, divides, exponentiations, logarithms, cosines, you name it. You can imagine that numerical error is going to build up pretty quickly. In a "good" numerical method, that error will accumulate in a random direction with each operation, so the aggregate error won't be too bad. But in a "bad" numerical method, the error may be biased in a specific direction, so the aggregate error may be terrible.

In either case, the result is essentially guaranteed to be "wrong," in the sense that it's approximate -- which was guaranteed from the start, because floating point representation is approximate, as above. If your definition of "replicable" is "gets the exact same result bit-by-bit," and you achieve that aim, there's a decent chance that it'll be the exact same, scientifically unacceptable, amount of wrong. Which isn't a win. So the notion that bit-by-bit replicability is the same as scientific replicability is debatable.

The advantage of packages like BLAS is that they've been designed by people who know an awful lot about these issues, so BLAS is pretty robust with respect to floating point issues. But BLAS isn't designed or intended for bit-by-bit replicability, because in large scale numerical calculations, bit-by-bit replicability isn't necessarily valuable from a scientific perspective.

Henry

--

On Wed, 14 Jun 2017, Bennet Fauber wrote:

> Peter brings up an interesting point about code quality and its role
> in replicability. It may be that too strong a reliance on particular
> underlying libraries is really an indication of unstable code or
> unstable methods.
>
> Good numerical code should largely survive recompilation. A good
> example of this is the code in R, I think.
> The R maintainers have
> warnings about using MKL or other optimized libraries, and they
> provide robust source code for the basic BLAS functions R needs, for
> those who aren't interested in or able to evaluate whether the
> differences shown between the tests at compilation and the baseline
> are significant for their research or not.
>
> Regarding containers and HPC, Singularity is making rapid inroads into
> HPC centers, I think, and it will only become more prevalent now that
> the Singularity people have largely got it so non-root users can
> create and maintain containers. That's been a huge issue for HPC
> centers like ours with Docker, which wants to run as root.
> Singularity containers run entirely as the invoking user and in
> unprivileged space, which makes them far less controversial.
>
> Cloud providers, like Amazon, or 'cloud' cluster providers like
> Penguin, do offer something else that is increasingly desirable to
> researchers at big universities, and that is independence and
> portability.
>
> If a junior faculty member or graduate student -- or undergraduate --
> builds something in AWS and moves to a different university, there is
> no interruption to the research program: AWS doesn't have to move.
> Similarly
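Henry's point above -- that rounding error accumulates across many operations, and that a "good" method keeps it controlled -- can be sketched in a few lines of Python, comparing naive left-to-right summation with the compensated summation the standard library provides in math.fsum:

```python
import math

# Ten copies of 0.1: the exact decimal answer is 1.0, but the binary
# double nearest to 0.1 is slightly larger than 0.1, and naive
# left-to-right summation rounds after every single addition.
xs = [0.1] * 10

naive = sum(xs)        # accumulates a rounding error at each of the 9 additions
exact = math.fsum(xs)  # compensated (Shewchuk) summation: one rounding at the end

print(naive)           # 0.9999999999999999
print(exact)           # 1.0
print(naive == exact)  # False
```

Scale that up to zillions of operations in a long-running simulation and the difference between a method that lets errors cancel and one that lets them pile up in one direction becomes exactly the gap between usable and useless results.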
Re: [Discuss] Serverless scientific computing (function as a service)
Hi Peter,

Nature had a piece on containers recently -- https://www.nature.com/news/software-simplified-1.22059 -- which has Lorena Barba making exactly the same points as you about software robustness! So at least the opinions are getting out there...

In my experience, however, it's difficult to get scientists to understand that they should immediately stop working with their 15-year-old stack of software and pipelines and reengineer it from scratch so as to be robust ;). So bandaids are sadly the soup du decade. Less tongue-in-cheek: for many non-technical reasons, we've settled for an incredibly fragile software infrastructure. I don't see us working our way out of that anytime soon. A few thoughts and links here: http://ivory.idyll.org/blog/2017-pof-software-archivability.html

best,
--titus

On Wed, Jun 14, 2017 at 09:11:47AM +0200, Peter Steinbach wrote:
> Hi everyone,
>
> thanks for the interesting discussion so far. From my personal point of
> view, I'd fully agree with the computational burst based argument. If a
> robust pipeline needs to scale for a short amount of time and local HPC
> resources are blocked, the cloud is an essential resource.
>
> However, with projects like [1] or [2] I don't buy into the argument
> that using HPC is forbidding due to reproducibility of scientific
> results. I know that many HPC installations are very conservative when
> it comes to containerized execution (like in the cloud) and have a long
> lag of implementing modern technologies, but containerized execution for
> the sake of having a fixed set of dependencies can also be considered as
> a lack of software quality. For me, this in turn is a result of our
> academic system of incentives, i.e. published results are valued higher
> than the tools that produced them (which makes people invest less in
> infrastructure). The latter often leads to brittle build systems and the
> lack of tests.
> It's interesting (if not paradoxical) to me that people tend to take
> money in their hands to buy compute hours in the cloud to actually
> mitigate this.
>
> Cheers,
> Peter
>
> [1] http://www.nersc.gov/research-and-development/user-defined-images/
> [2] http://singularity.lbl.gov/
>
> On 06/13/2017 08:12 PM, C. Titus Brown wrote:
>> Hi all,
>>
>> we have done varying amounts of cloud computing, but it tends not to be
>> price competitive when developing/debugging analysis pipelines for large
>> sets of data (vertebrate GWAS, etc.) because of the disk space needs.
>>
>> The UCSC Genome Center folk are relying increasingly on cloud computing
>> because it is so flexible and burst-scalable - also see Dockstore.org
>> for something that they are doing across cancer centers.
>>
>> With regard to Alex Savio's comment on clinical data - I don't know where in
>> the world you are, Alex, but at least in the US there are several portions of
>> AWS that are HIPAA-compliant. The entire UC system can use AWS for clinical
>> data now, for example. I can seek out details if anyone is interested.
>>
>> Personally I think HPCs are a problem for reproducibility (see
>> blogs.nature.com/naturejobs/2017/06/01/techblog-c-titus-brown-predicting-the-paper-of-the-future
>> for a small bit of context) and am a big fan of
>> computing *like* you're in the cloud (VMs or docker or singularity) so as to
>> manage dependencies. But while that is something that quite a few experienced
>> computational folk seem to agree with, I'm not sure how many people I will be
>> able to convince of that in the broader world ;).
>>
>> best,
>> --titus
>>
>> On Tue, Jun 13, 2017 at 05:54:36PM +, alexsa...@gmail.com wrote:
>>> Hi Peter,
>>>
>>> I wouldn't be able to use such services with clinical data. It's totally not
>>> an option for me.
>>> Although I've seen some talks and the performance seems quite competitive
>>> since scalability is easy.
>>> It's true that uploading a big quantity of data can take considerable
>>> time and bandwidth; some labs use the weekends for data uploading. One
>>> problem may be to convince university fund managers to pay for external
>>> computing services when they already provide HPC services.
>>>
>>> My five cents...
>>>
>>> On Tue, 13 Jun 2017, 13:38 Peter Steinbach wrote:
>>>
>>> Dear both,
>>>
>>> as a side note (and my apologies for digressing), I was wondering how
>>> popular cloud computing for data processing at scale in an academic
>>> context is in the US or elsewhere? Here in Europe, many universities
>>> run their own HPC centers where people can sign up to process larger
>>> amounts of data or do larger simulations or whatnot ... mostly people
>>> here are concerned about efficiency (data connections into the cloud
>>> are typically poor, VM overhead is considerable) and
>>> security/confidentiality when putting scientific workflows into the
>>> cloud. What is your take on this?
>>>
>>> Best,
>>> Peter
>>>
>>> PS. I love the
Re: [Discuss] Serverless scientific computing (function as a service)
Peter brings up an interesting point about code quality and its role in replicability. It may be that too strong a reliance on particular underlying libraries is really an indication of unstable code or unstable methods.

Good numerical code should largely survive recompilation. A good example of this is the code in R, I think. The R maintainers have warnings about using MKL or other optimized libraries, and they provide robust source code for the basic BLAS functions R needs, for those who aren't interested in or able to evaluate whether the differences shown between the tests at compilation and the baseline are significant for their research or not.

Regarding containers and HPC, Singularity is making rapid inroads into HPC centers, I think, and it will only become more prevalent now that the Singularity people have largely got it so non-root users can create and maintain containers. That's been a huge issue for HPC centers like ours with Docker, which wants to run as root. Singularity containers run entirely as the invoking user and in unprivileged space, which makes them far less controversial.

Cloud providers, like Amazon, or 'cloud' cluster providers like Penguin, do offer something else that is increasingly desirable to researchers at big universities, and that is independence and portability. If a junior faculty member or graduate student -- or undergraduate -- builds something in AWS and moves to a different university, there is no interruption to the research program: AWS doesn't have to move. Similarly with something like Penguin on Demand for more traditional HPC programs (e.g., MPI-based).

Doing work in the cloud also frees one from the local IT department, which may have strict rules about what can and cannot be installed on the institution's computers and how they can be used -- rules that may be contrary to what is needed (or wanted) for the workflows.
In some cases, research can also be done outside of academia or industry, in which case in-house infrastructure probably doesn't exist. That might be more applicable to the social sciences, but I could imagine independent scholars doing work in field biology, ecology, water quality, etc. They would benefit greatly from access to computational machinery that is free not only of encumbering licensing but of institutional [sic] infrastructure.

Just some more thoughts for the hearth,

-- bennet

On Wed, Jun 14, 2017 at 3:11 AM, Peter Steinbach wrote:
> Hi everyone,
>
> thanks for the interesting discussion so far. From my personal point of
> view, I'd fully agree with the computational burst based argument. If a
> robust pipeline needs to scale for a short amount of time and local HPC
> resources are blocked, the cloud is an essential resource.
>
> However, with projects like [1] or [2] I don't buy into the argument that
> using HPC is forbidding due to reproducibility of scientific results. I know
> that many HPC installations are very conservative when it comes to
> containerized execution (like in the cloud) and have a long lag of
> implementing modern technologies, but containerized execution for the sake
> of having a fixed set of dependencies can also be considered as a lack of
> software quality. For me, this in turn is a result of our academic system
> of incentives, i.e. published results are valued higher than the tools that
> produced them (which makes people invest less in infrastructure). The latter
> often leads to brittle build systems and the lack of tests. It's interesting
> (if not paradoxical) to me that people tend to take money in their hands to
> buy compute hours in the cloud to actually mitigate this.
>
> Cheers,
> Peter
>
> [1] http://www.nersc.gov/research-and-development/user-defined-images/
> [2] http://singularity.lbl.gov/
>
> On 06/13/2017 08:12 PM, C.
Titus Brown wrote: >> >> Hi all, >> >> we have done varying amounts of cloud computing, but it tends not be >> price competitive when developing/debugging analysis pipelines for large >> sets of data (vertebrate GWAS, etc.) because of the disk space needs. >> >> The UCSC Genome Center folk are relying increasingly on cloud computing >> because it is so flexible and burst-scalable - also see Dockstore.org >> for something that they are doing across cancer centers. >> >> With regard to Alex Savio's comment on clinical data - I don't know where >> in >> the world you are, Alex, but at least in the US there are several portions >> of >> AWS that are HIPAA-compliant. The entire UC system can use AWS for >> clinical >> data now, for example. I can seek out details if anyone is interested. >> >> Personally I think HPCs are a problem for reproducibility (see >> >> blogs.nature.com/naturejobs/2017/06/01/techblog-c-titus-brown-predicting-the-paper-of-the-future) >> for a small bit of context and am a big fan of >> computing *like* you're in the cloud (VMs or docker or singularity) so as >> to >> manage dependencies. But while that is something
Re: [Discuss] Serverless scientific computing (function as a service)
Hi everyone,

thanks for the interesting discussion so far. From my personal point of view, I'd fully agree with the computational burst based argument. If a robust pipeline needs to scale for a short amount of time and local HPC resources are blocked, the cloud is an essential resource.

However, with projects like [1] or [2] I don't buy into the argument that using HPC is forbidding due to reproducibility of scientific results. I know that many HPC installations are very conservative when it comes to containerized execution (like in the cloud) and have a long lag of implementing modern technologies, but containerized execution for the sake of having a fixed set of dependencies can also be considered as a lack of software quality. For me, this in turn is a result of our academic system of incentives, i.e. published results are valued higher than the tools that produced them (which makes people invest less in infrastructure). The latter often leads to brittle build systems and the lack of tests. It's interesting (if not paradoxical) to me that people tend to take money in their hands to buy compute hours in the cloud to actually mitigate this.

Cheers,
Peter

[1] http://www.nersc.gov/research-and-development/user-defined-images/
[2] http://singularity.lbl.gov/

On 06/13/2017 08:12 PM, C. Titus Brown wrote:

Hi all,

we have done varying amounts of cloud computing, but it tends not to be price competitive when developing/debugging analysis pipelines for large sets of data (vertebrate GWAS, etc.) because of the disk space needs.

The UCSC Genome Center folk are relying increasingly on cloud computing because it is so flexible and burst-scalable - also see Dockstore.org for something that they are doing across cancer centers.

With regard to Alex Savio's comment on clinical data - I don't know where in the world you are, Alex, but at least in the US there are several portions of AWS that are HIPAA-compliant. The entire UC system can use AWS for clinical data now, for example.
I can seek out details if anyone is interested.

Personally I think HPCs are a problem for reproducibility (see blogs.nature.com/naturejobs/2017/06/01/techblog-c-titus-brown-predicting-the-paper-of-the-future for a small bit of context) and am a big fan of computing *like* you're in the cloud (VMs or docker or singularity) so as to manage dependencies. But while that is something that quite a few experienced computational folk seem to agree with, I'm not sure how many people I will be able to convince of that in the broader world ;).

best,
--titus

On Tue, Jun 13, 2017 at 05:54:36PM +, alexsa...@gmail.com wrote:

Hi Peter,

I wouldn't be able to use such services with clinical data. It's totally not an option for me. Although I've seen some talks, and the performance seems quite competitive since scalability is easy. It's true that uploading a big quantity of data can take considerable time and bandwidth; some labs use the weekends for data uploading. One problem may be to convince university fund managers to pay for external computing services when they already provide HPC services.

My five cents...

On Tue, 13 Jun 2017, 13:38 Peter Steinbach wrote:

Dear both,

as a side note (and my apologies for digressing), I was wondering how popular cloud computing for data processing at scale in an academic context is in the US or elsewhere? Here in Europe, many universities run their own HPC centers where people can sign up to process larger amounts of data or do larger simulations or whatnot ... mostly people here are concerned about efficiency (data connections into the cloud are typically poor, VM overhead is considerable) and security/confidentiality when putting scientific workflows into the cloud. What is your take on this?

Best,
Peter

PS. I love the "serverless" metaphor. Gets rid of all the problems of computers. ;)

On 06/12/2017 06:02 PM, Marianne Corvellec wrote:

Hi Justin,

Thank you so much for the quick reply! I'm going to give this new package a try.
Best, Marianne On Fri, Jun 9, 2017 at 11:20 AM, Justin Kitzes wrote: Hi Marianne, PyWren by Eric Jonas sounds like it's pretty similar to what you're looking for - http://pywren.io/ It's a relatively new package that's still in active development, but Eric is very interested in expanding it (and has some support from the riselab at UC Berkeley to do so). I know that he's also actively looking for use cases, so I'd definitely suggest getting in touch with him if you're interested. Best, Justin -- Justin Kitzes Energy and Resources Group Berkeley Institute for Data Science University of California, Berkeley On Jun 9, 2017, at 6:51 AM, Marianne Corvellec < marianne.corvel...@gmail.com> wrote: Dear community, I'm curious as to whether some of you might have worked on or used solutions such as AWS Lambda in the context of your scientific research. If so, have you documented it in a blog
Re: [Discuss] Serverless scientific computing (function as a service)
I have anecdotes rather than data, but around here it's getting to be more popular as more people try it out. Folks do need to take care to look through their grant terms to see if there's any specific language about region restrictions or cloud computing restrictions. (I have a friend who keeps burning through SSDs on an under-her-desk server who would love to have Amazon's resources available but her grant specifically prohibits cloud computing.) One local example: Here at the University of Illinois we had a team spend 3 months analyzing a set of data, only to discover at the end of the 3 months that there was a software version mismatch that made that run incompatible with the rest of their results. They came to talk to my department's cloud and virtualization team about what could be done, and discovered that there was a way to re-run their work in 3 days at a really cost effective price point on Amazon Web Services with our team's help. (Our lead thinks he might be able to get that originally 3 month process down into 1 day with a little more optimization.) Our AWS team often serves as an intermediary to help guide researchers through navigating the networking, security, and optimization issues. -Dena Strong, Technology Services, University of Illinois -Original Message- From: Discuss [mailto:discuss-boun...@lists.software-carpentry.org] On Behalf Of Peter Steinbach Sent: Tuesday, June 13, 2017 6:38 AM To: discuss@lists.software-carpentry.org Subject: Re: [Discuss] Serverless scientific computing (function as a service) Dear both, as a side note (and my apologies for digressing), I was wondering how popular cloud computing for data processing at scale in an academic context is in the US or elsewhere? Here in Europe, many universities run their own HPC centers where people can sign up to process larger amounts of data or do larger simulations or whatnot ... 
mostly people here are concerned about efficiency (data connnections into the cloud are typically poor, VM overhead is considerable) and security/confidentiality when putting scientific workflows into the cloud. What is your take on this? Best, Peter PS. I love the "serverless" metaphor. Get's rid of all the problems of computers. ;) On 06/12/2017 06:02 PM, Marianne Corvellec wrote: > Hi Justin, > > Thank you so much for the quick reply! > > I'm going to give this new package a try. > > Best, > Marianne > > On Fri, Jun 9, 2017 at 11:20 AM, Justin Kitzes <jkit...@berkeley.edu> wrote: >> Hi Marianne, >> >> PyWren by Eric Jonas sounds like it's pretty similar to what you're >> looking for - >> >> http://pywren.io/ >> >> It's a relatively new package that's still in active development, but Eric >> is very interested in expanding it (and has some support from the riselab at >> UC Berkeley to do so). I know that he's also actively looking for use cases, >> so I'd definitely suggest getting in touch with him if you're interested. >> >> Best, >> >> Justin >> >> -- >> Justin Kitzes >> Energy and Resources Group >> Berkeley Institute for Data Science >> University of California, Berkeley >> >>> On Jun 9, 2017, at 6:51 AM, Marianne Corvellec >>> <marianne.corvel...@gmail.com> wrote: >>> >>> Dear community, >>> >>> I'm curious as to whether some of you might have worked on or used >>> solutions such as AWS Lambda in the context of your scientific >>> research. >>> >>> If so, have you documented it in a blog post that you could share? >>> Thanks in advance! >>> >>> Without even considering workflows or full-fledged projects, >>> wouldn't we want to be able to make a standard API call to, say, fit >>> a polynomial to some data? Is anyone aware of any effort in this >>> direction? >>> >>> A friend of mine just drew my attention to this general issue, which >>> touches on open science and reproducible research... In the >>> meantime, I'll encourage him to join this mailing list! 
>>> Thank you,
>>> Marianne
>>> ___
>>> Discuss mailing list
>>> Discuss@lists.software-carpentry.org
>>> http://lists.software-carpentry.org/listinfo/discuss
Re: [Discuss] Serverless scientific computing (function as a service)
Hi Peter,

I wouldn't be able to use such services with clinical data. It's totally not an option for me. Although I've seen some talks, and the performance seems quite competitive since scalability is easy. It's true that uploading a big quantity of data can take considerable time and bandwidth; some labs use the weekends for data uploading. One problem may be to convince university fund managers to pay for external computing services when they already provide HPC services.

My five cents...

On Tue, 13 Jun 2017, 13:38 Peter Steinbach wrote:

> Dear both,
>
> as a side note (and my apologies for digressing), I was wondering how
> popular cloud computing for data processing at scale in an academic
> context is in the US or elsewhere?
>
> Here in Europe, many universities run their own HPC centers where people
> can sign up to process larger amounts of data or do larger simulations
> or whatnot ... mostly people here are concerned about efficiency (data
> connections into the cloud are typically poor, VM overhead is
> considerable) and security/confidentiality when putting scientific
> workflows into the cloud.
> What is your take on this?
>
> Best,
> Peter
>
> PS. I love the "serverless" metaphor. Gets rid of all the problems of
> computers. ;)
>
> On 06/12/2017 06:02 PM, Marianne Corvellec wrote:
> > Hi Justin,
> >
> > Thank you so much for the quick reply!
> >
> > I'm going to give this new package a try.
> >
> > Best,
> > Marianne
> >
> > On Fri, Jun 9, 2017 at 11:20 AM, Justin Kitzes wrote:
> >> Hi Marianne,
> >>
> >> PyWren by Eric Jonas sounds like it's pretty similar to what you're
> >> looking for -
> >>
> >> http://pywren.io/
> >>
> >> It's a relatively new package that's still in active development, but
> >> Eric is very interested in expanding it (and has some support from the
> >> riselab at UC Berkeley to do so). I know that he's also actively looking
> >> for use cases, so I'd definitely suggest getting in touch with him if
> >> you're interested.
> >> > >> Best, > >> > >> Justin > >> > >> -- > >> Justin Kitzes > >> Energy and Resources Group > >> Berkeley Institute for Data Science > >> University of California, Berkeley > >> > >>> On Jun 9, 2017, at 6:51 AM, Marianne Corvellec < > marianne.corvel...@gmail.com> wrote: > >>> > >>> Dear community, > >>> > >>> I'm curious as to whether some of you might have worked on or used > >>> solutions such as AWS Lambda in the context of your scientific > >>> research. > >>> > >>> If so, have you documented it in a blog post that you could share? > >>> Thanks in advance! > >>> > >>> Without even considering workflows or full-fledged projects, wouldn't > >>> we want to be able to make a standard API call to, say, fit a > >>> polynomial to some data? Is anyone aware of any effort in this > >>> direction? > >>> > >>> A friend of mine just drew my attention to this general issue, which > >>> touches on open science and reproducible research... In the meantime, > >>> I'll encourage him to join this mailing list! > >>> > >>> Thank you, > >>> Marianne > >>> ___ > >>> Discuss mailing list > >>> Discuss@lists.software-carpentry.org > >>> http://lists.software-carpentry.org/listinfo/discuss > >> > > ___ > > Discuss mailing list > > Discuss@lists.software-carpentry.org > > http://lists.software-carpentry.org/listinfo/discuss > > > ___ > Discuss mailing list > Discuss@lists.software-carpentry.org > http://lists.software-carpentry.org/listinfo/discuss -- Sent from my phone, sorry for brevity or typos. ___ Discuss mailing list Discuss@lists.software-carpentry.org http://lists.software-carpentry.org/listinfo/discuss
Re: [Discuss] Serverless scientific computing (function as a service)
Dear both,

as a side note (and my apologies for digressing), I was wondering how popular cloud computing for data processing at scale in an academic context is in the US or elsewhere?

Here in Europe, many universities run their own HPC centers where people can sign up to process larger amounts of data or do larger simulations or whatnot ... mostly people here are concerned about efficiency (data connections into the cloud are typically poor, VM overhead is considerable) and security/confidentiality when putting scientific workflows into the cloud. What is your take on this?

Best,
Peter

PS. I love the "serverless" metaphor. Gets rid of all the problems of computers. ;)

On 06/12/2017 06:02 PM, Marianne Corvellec wrote:

Hi Justin,

Thank you so much for the quick reply! I'm going to give this new package a try.

Best,
Marianne

On Fri, Jun 9, 2017 at 11:20 AM, Justin Kitzes wrote:

Hi Marianne,

PyWren by Eric Jonas sounds like it's pretty similar to what you're looking for - http://pywren.io/

It's a relatively new package that's still in active development, but Eric is very interested in expanding it (and has some support from the riselab at UC Berkeley to do so). I know that he's also actively looking for use cases, so I'd definitely suggest getting in touch with him if you're interested.

Best,

Justin

--
Justin Kitzes
Energy and Resources Group
Berkeley Institute for Data Science
University of California, Berkeley

On Jun 9, 2017, at 6:51 AM, Marianne Corvellec wrote:

Dear community,

I'm curious as to whether some of you might have worked on or used solutions such as AWS Lambda in the context of your scientific research.

If so, have you documented it in a blog post that you could share? Thanks in advance!

Without even considering workflows or full-fledged projects, wouldn't we want to be able to make a standard API call to, say, fit a polynomial to some data? Is anyone aware of any effort in this direction?
A friend of mine just drew my attention to this general issue, which touches on open science and reproducible research... In the meantime, I'll encourage him to join this mailing list!

Thank you,
Marianne
Re: [Discuss] Serverless scientific computing (function as a service)
Hi Justin,

Thank you so much for the quick reply!

I'm going to give this new package a try.

Best,
Marianne

On Fri, Jun 9, 2017 at 11:20 AM, Justin Kitzes wrote:
> Hi Marianne,
>
> PyWren by Eric Jonas sounds like it's pretty similar to what you're looking
> for -
>
> http://pywren.io/
>
> It's a relatively new package that's still in active development, but Eric is
> very interested in expanding it (and has some support from the riselab at UC
> Berkeley to do so). I know that he's also actively looking for use cases, so
> I'd definitely suggest getting in touch with him if you're interested.
>
> Best,
>
> Justin
>
> --
> Justin Kitzes
> Energy and Resources Group
> Berkeley Institute for Data Science
> University of California, Berkeley
>
>> On Jun 9, 2017, at 6:51 AM, Marianne Corvellec wrote:
>>
>> Dear community,
>>
>> I'm curious as to whether some of you might have worked on or used
>> solutions such as AWS Lambda in the context of your scientific
>> research.
>>
>> If so, have you documented it in a blog post that you could share?
>> Thanks in advance!
>>
>> Without even considering workflows or full-fledged projects, wouldn't
>> we want to be able to make a standard API call to, say, fit a
>> polynomial to some data? Is anyone aware of any effort in this
>> direction?
>>
>> A friend of mine just drew my attention to this general issue, which
>> touches on open science and reproducible research... In the meantime,
>> I'll encourage him to join this mailing list!
>>
>> Thank you,
>> Marianne
Re: [Discuss] Serverless scientific computing (function as a service)
Hi Marianne,

PyWren by Eric Jonas sounds like it's pretty similar to what you're looking for -

http://pywren.io/

It's a relatively new package that's still in active development, but Eric is very interested in expanding it (and has some support from the riselab at UC Berkeley to do so). I know that he's also actively looking for use cases, so I'd definitely suggest getting in touch with him if you're interested.

Best,

Justin

--
Justin Kitzes
Energy and Resources Group
Berkeley Institute for Data Science
University of California, Berkeley

> On Jun 9, 2017, at 6:51 AM, Marianne Corvellec wrote:
>
> Dear community,
>
> I'm curious as to whether some of you might have worked on or used
> solutions such as AWS Lambda in the context of your scientific
> research.
>
> If so, have you documented it in a blog post that you could share?
> Thanks in advance!
>
> Without even considering workflows or full-fledged projects, wouldn't
> we want to be able to make a standard API call to, say, fit a
> polynomial to some data? Is anyone aware of any effort in this
> direction?
>
> A friend of mine just drew my attention to this general issue, which
> touches on open science and reproducible research... In the meantime,
> I'll encourage him to join this mailing list!
>
> Thank you,
> Marianne
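Marianne's idea of a standard "fit a polynomial" API call maps naturally onto the function-as-a-service model: the unit of deployment is a small, stateless function taking plain data in and returning plain coefficients out. As a sketch of what such a function might look like (pure standard library, degree-1 for brevity; the function name and payload shape here are hypothetical, not any platform's actual API):

```python
# A stateless "fit" function of the kind one could deploy behind a
# FaaS endpoint (AWS Lambda, PyWren, ...): plain data in, plain
# coefficients out, no state held between calls.
def fit_line(points):
    """Least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    # Closed-form normal-equation solution for the 1D case.
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

# Points lying exactly on y = 2x + 1 recover the coefficients:
print(fit_line([(0, 1.0), (1, 3.0), (2, 5.0)]))  # (2.0, 1.0)
```

On a platform like PyWren the same function body would simply be handed to the executor's map over many datasets; the point is that anything expressible as "pure function of its inputs" is a candidate for this style of serverless scientific computing.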