My point is that the contemporary Data Science stack uses too many different languages, all the way from scripting (R, Python) to statically compiled C/C++ and sometimes Fortran (parts of R and some SciPy algorithms are Fortran), and even JVM-based Scala. This creates artificial barriers: data scientists play the Python/R game but struggle with Scala, while software engineers write pipelines in Spark/Scala but have no interest in R. Deploying to production often requires recoding from one language to another. I hope that as the field matures there will be more consolidation and unification across the language zoo. Language barriers in scientifically heavy fields are not healthy. In statistics, Python's statsmodels is a pale shadow of R's CRAN. The science community is split along language lines, which spreads already thin resources even further. --Leo
On Tuesday, July 16, 2019 at 4:45:39 PM UTC-4, Jesper Louis Andersen wrote:
> On Tue, Jul 16, 2019 at 7:18 PM Slonik Az <slon...@gmail.com> wrote:
>
>> A REPL in a statically AOT-compiled language is hard, yet Swift somehow managed to implement it.
>
> I must disagree. The technique is somewhat well known and has a long history; see e.g. various Common Lisp and Standard ML implementations. If you are willing to accept a hybrid of a byte-code interpreter with a native code compiler at your disposal, then OCaml and Haskell will suffice in addition. When a function is defined in the REPL, you just call the compiler and it emits assembly language. You then mark that region of memory as executable, and you jump to it when the function is invoked. In some cases a dispatch table is used so a function can be replaced post facto. The technique has fallen somewhat out of favor compared to the hybrid approaches, probably because modern computers are fast enough when you are exploring.
>
> In my experience, most data science is about processing of data so that it is suitable for doing science. Exploratory tools are good for understanding the model you are working in. However, real-world data processing can require you to work on several terabytes of data (or more!). There is a threshold where it starts becoming beneficial to optimize the processing pipeline, especially the pre-processing parts, and lower-level languages such as Go tend to fare really well here. These lower-level tools can then be hooked into e.g. R and Python, empowering the exploratory part of the system.
>
> Another important point is that modern computational kernels, for instance TensorFlow, are really compilers from a data-flow graph representation to highly optimized numerical routines, some of which execute on specialized numerical hardware (8-32 bit floating point SIMD hardware). You can define such a graph in Python, but then export it and use it in other systems and pipelines. As such, Python, your exploratory vehicle, provides a plug-in for a more low-level processing pipeline. This also allows part of the graph to run inside a mobile client. The plug-in model is also followed by parallel array processing languages, see e.g. Futhark (https://futhark-lang.org/); you embed your kernel in another system. If you read Michael Jones's post, there are important similarities.
>
> --
> J.
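
To make the "hook lower-level Go tools into R and Python" point concrete, here is a minimal sketch of a Go pre-processing routine exported as a C shared library; Python can load it with ctypes and R with dyn.load. The function name, the threshold parameter, and the library name are illustrative assumptions, not anything from this thread.

// prefilter.go
//
// Minimal sketch: a Go pre-processing step exposed to Python/R as a
// C shared library.
// Build: go build -buildmode=c-shared -o libprefilter.so prefilter.go
package main

import "C"

import "unsafe"

// FilterAboveThreshold compacts the first n elements of data in place,
// keeping only values strictly greater than threshold, and returns how
// many values were kept.
//
//export FilterAboveThreshold
func FilterAboveThreshold(data *C.double, n C.int, threshold C.double) C.int {
	// View the caller's C buffer as a Go slice without copying.
	buf := unsafe.Slice((*float64)(unsafe.Pointer(data)), int(n))
	kept := 0
	for _, v := range buf {
		if v > float64(threshold) {
			buf[kept] = v
			kept++
		}
	}
	return C.int(kept)
}

func main() {} // required for -buildmode=c-shared

The generated header and .so let the same hot loop be called from C, Python (ctypes.CDLL), or R (.C) without recoding it in each language, which is exactly the "empower the exploratory part" arrangement described above.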
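The "define the graph in Python, export it, use it elsewhere" workflow can likewise be driven from Go via the TensorFlow Go binding. A hedged sketch, assuming a SavedModel exported from Python into a directory named exported_model with the "serve" tag and operations named "input" and "output" (those names and shapes are assumptions for illustration):

package main

import (
	"fmt"
	"log"

	tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

func main() {
	// Load a graph that was defined and exported from Python.
	model, err := tf.LoadSavedModel("exported_model", []string{"serve"}, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer model.Session.Close()

	// Feed one example; the operation names are placeholders for
	// whatever the exported graph actually uses.
	input, err := tf.NewTensor([][]float32{{1.0, 2.0, 3.0}})
	if err != nil {
		log.Fatal(err)
	}
	out, err := model.Session.Run(
		map[tf.Output]*tf.Tensor{
			model.Graph.Operation("input").Output(0): input,
		},
		[]tf.Output{model.Graph.Operation("output").Output(0)},
		nil,
	)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out[0].Value())
}

Here Python stays the exploratory vehicle for building and training the graph, while the exported artifact is consumed by a Go service in the production pipeline, which is the plug-in arrangement Jesper describes.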