There are a number of real-world open source Spark applications in the sciences:

genomics:

http://github.com/bigdatagenomics/adam (core is Scala, has Python/R wrappers)
https://github.com/broadinstitute/gatk (core is Java)
https://github.com/hail-is/hail (core is Scala, mostly used through Python wrappers)

neuroscience:

https://github.com/thunder-project/thunder#using-with-spark (PySpark)

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

> On Jul 25, 2017, at 8:09 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> Continuous integration (Travis, Jenkins) and reporting on unit tests, 
> integration tests, etc. for each source code version.
> 
> On 25. Jul 2017, at 16:58, Adaryl Wakefield <adaryl.wakefi...@hotmail.com> wrote:
> 
>> ci+reporting? I’ve never heard of that term before. What is that?
>>  
>> Adaryl "Bob" Wakefield, MBA
>> Principal
>> Mass Street Analytics, LLC
>> 913.938.6685
>> www.massstreet.net
>> www.linkedin.com/in/bobwakefieldmba
>> Twitter: @BobLovesData <http://twitter.com/BobLovesData>
>>  
>>  
>> From: Jörn Franke [mailto:jornfra...@gmail.com]
>> Sent: Tuesday, July 25, 2017 8:31 AM
>> To: Adaryl Wakefield <adaryl.wakefi...@hotmail.com>
>> Cc: user@spark.apache.org
>> Subject: Re: real world spark code
>>  
>> Look for the ones that have unit and integration tests as well as 
>> ci+reporting on code quality.
>>  
>> All the others are just toy examples. Well, they should be :)
>> 
>> On 25. Jul 2017, at 01:08, Adaryl Wakefield <adaryl.wakefi...@hotmail.com> wrote:
>> 
>> Does anybody know of publicly available GitHub repos of real-world Spark 
>> applications written in Scala?