Thanks for the suggestions. These are certainly the main areas 
we're looking to address as part of this work.

I'd be interested to hear if you have more thoughts about the model 
specification/probabilistic programming language. A few other people have 
requested things like this, and this would certainly play to Julia's 
strengths (as shown by JuMP.jl). That said, a full-scale probabilistic 
programming language might be a bit too much to ask as part of this work 
(keep in mind that Stan has been a 3+ year project with 2-3 full-time devs + 
volunteers), but there might be some low-hanging fruit here we can pick.

-simon

On Sunday, 27 December 2015 02:32:43 UTC, Lampkld wrote:
>
> Thanks for the response.
>
> Since you kindly asked, the following are the two main areas in our assessment 
> of the general arc of the Julia ecosystem:
>
> 1. Will the roadmap remove some of the bottlenecks in a normal day-to-day 
> exploratory workflow? These are minimal things that R and Python have and 
> whose lack hampers any use of Julia for regular analysis: things like a robust 
> dataframe with data I/O in different formats, web scraping, settled 
> nullable semantics and their integration with the ecosystem, robust data 
> cleaning and tidy data, modeling with basic diagnostic tests, etc.
>
> 2. Will the roadmap leapfrog into areas and capabilities that are 
> currently not covered by other stats and data science ecosystems?
>
>  There are many here, but we are specifically looking at the ability to 
> do modeling on medium-sized out-of-core databases. This would 
> include an abstract dataframe-like interface to said databases, such as MySQL 
> and SQLite, and some sort of modeling capability on the same. My dream would 
> be separation of model specification, as a DAG/probabilistic programming 
> framework, from fitting the model. Thus the same model could be fit with 
> different sorts of data and optimizers. Streaming black-box variational 
> inference could be a means to extend this to OOC work. 
>
> I realize Julia won't for a while have all the statistical tests and 
> random models of Python, much less R. However, a general yet powerful and 
> scalable data querying and probabilistic programming framework could arguably 
> suffice for most Python and R use cases in data science while providing a 
> comparative advantage over other frameworks where it counts. To my 
> knowledge, SAS and Stata are currently the only packages that offer general 
> modeling with on-disk data sets, but the sort of capability I outlined 
> would seem to exceed what they offer. 
>
> A bonus would be filling out Gadfly towards ggplot and ggvis capability. 
>  
>
>
> On Thursday, December 24, 2015 at 11:50:42 AM UTC-5, Viral Shah wrote:
>>
>> What would be helpful is to know what kind of decisions you are thinking 
>> of and what are the factors. 
>>
>> I suspect within 2 weeks for sure - but it's really for the Julia stats 
>> folks to say. The idea is to get feedback and chart a course.
>>
>> -viral
>> On 24 Dec 2015 10:07 p.m., "Lampkld" <lamp...@gmail.com> wrote:
>>
>>> Sorry to bug you, but can we expect something this week or next? It would 
>>> be helpful to know how long to push some stuff off. 
>>>
>>> On Thursday, December 17, 2015 at 6:20:45 PM UTC-5, Viral Shah wrote:
>>>>
>>>>
>>>> The JuliaStats team will be publishing a general plan on stats+df in a 
>>>> few days. I doubt we will have settled on all the df issues by then, but 
>>>> at least there will be something to start with. 
>>>>
>>>>
>>>> -viral 
>>>>
>>>>
>>>>
>>>> > On 17-Dec-2015, at 10:15 PM, Lampkld <lamp...@gmail.com> wrote: 
>>>> > 
>>>> > Hi Viral, 
>>>> > 
>>>> > Any update on this (stats + df) by chance or idea when we can get 
>>>> > one? Even a roadmap or some sort of vision or other details would help 
>>>> > with decision making regarding infrastructure. 
>>>> > 
>>>> > Thanks! 
>>>> > 
>>>> > On Wednesday, November 11, 2015 at 3:00:50 AM UTC-5, Viral Shah wrote: 
>>>> > Yes, we are really excited. This grant is to focus on core Julia 
>>>> > compiler infrastructure and key math libraries. Much of the libraries 
>>>> > focus will be on statistical Computing. 
>>>> > -viral 
>>>> > 
>>>>
>>>>
