Re: Spark performance testing

2016-07-09 Thread Mich Talebzadeh
Hi Andrew,

I suggest that you narrow the scope of your performance testing: use the
same setup throughout and make incremental changes, keeping everything else
the same.

Spark itself can run in local, standalone, YARN client and YARN cluster
modes. So you really need to target one particular run setup and one
particular kind of application, such as SQL or streaming.

Then increase the memory while keeping the number of cores the same, and so on.
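
As a rough sketch of the one-change-at-a-time approach above, the following
Python snippet builds a series of spark-submit commands that vary only the
executor memory while the master, deploy mode and core count stay fixed. The
class name and jar path are placeholders (assumptions, not from this thread);
the flags used are standard spark-submit options.

```python
import shlex

# Hypothetical application details (assumptions); replace with your own job.
APP_CLASS = "com.example.MyJob"
APP_JAR = "my-job.jar"


def build_commands(memories, cores=2, master="yarn", deploy_mode="client"):
    """Return one spark-submit argument list per memory setting.

    Only --executor-memory changes between commands; everything else is fixed.
    """
    commands = []
    for mem in memories:
        commands.append([
            "spark-submit",
            "--master", master,
            "--deploy-mode", deploy_mode,
            "--executor-cores", str(cores),
            "--executor-memory", mem,
            "--class", APP_CLASS,
            APP_JAR,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in build_commands(["2g", "4g", "8g"]):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to check the test matrix; each
argument list could then be handed to subprocess.run and timed.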

For test data, you can generate your own using Linux shell scripts or similar.
Then I would say the tests will be more meaningful.
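
The suggestion above mentions shell scripts; the same idea can be sketched in
Python. This is a minimal example, and the column names and value ranges are
invented for illustration. A fixed seed makes the file reproducible, so runs
with different Spark settings all read identical input.

```python
import csv
import random


def write_test_data(path, rows, seed=42):
    """Write `rows` synthetic records to a CSV file at `path`.

    The same seed always produces the same file contents.
    """
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        # Illustrative schema (assumption): id, category, amount.
        writer.writerow(["id", "category", "amount"])
        for i in range(rows):
            writer.writerow([i, rng.choice("ABCD"), round(rng.uniform(0, 1000), 2)])
    return path


if __name__ == "__main__":
    write_test_data("test_data.csv", rows=1000)
```

Increasing `rows` between test rounds gives a simple way to see how run time
scales with input size while the data shape stays the same.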

HTH


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com





On 9 July 2016 at 05:28, Andrew Ehrlich  wrote:

> Yea, I'm looking for any personal experiences people have had with tools
> like these.
>
> On Jul 8, 2016, at 8:57 PM, charles li  wrote:
>
> Hi Andrew, I found lots of material when searching Google for "*spark
> performance test*":
>
>    - https://github.com/databricks/spark-perf
>    - https://spark-summit.org/2014/wp-content/uploads/2014/06/Testing-Spark-Best-Practices-Anupama-Shetty-Neil-Marshall.pdf
>    - http://people.cs.vt.edu/~butta/docs/tpctc2015-sparkbench.pdf
>
>
>
> On Sat, Jul 9, 2016 at 11:40 AM, Andrew Ehrlich 
> wrote:
>
>> Hi group,
>>
>> What solutions are people using to do performance testing and tuning of
>> Spark applications? I have been using a fairly manual technique where I lay
>> out an Excel sheet of various memory settings and caching parameters and
>> then execute each one by hand. It’s pretty tedious though, so I’m wondering
>> what others do, and whether you do performance testing at all. Also, is anyone
>> generating test data, or just operating on a static set? Is regression
>> testing for performance a thing?
>>
>> Andrew
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>
>
> --
> *___*
> Quant | Engineer | Boy
> *___*
> *blog*:http://litaotao.github.io
> *github*: www.github.com/litaotao
>
>


Re: Spark performance testing

2016-07-08 Thread Andrew Ehrlich
Yea, I'm looking for any personal experiences people have had with tools like 
these. 



Re: Spark performance testing

2016-07-08 Thread charles li
Hi Andrew, I found lots of material when searching Google for "*spark
performance test*":

   - https://github.com/databricks/spark-perf
   - https://spark-summit.org/2014/wp-content/uploads/2014/06/Testing-Spark-Best-Practices-Anupama-Shetty-Neil-Marshall.pdf
   - http://people.cs.vt.edu/~butta/docs/tpctc2015-sparkbench.pdf





-- 
*___*
Quant | Engineer | Boy
*___*
*blog*:http://litaotao.github.io
*github*: www.github.com/litaotao


Spark performance testing

2016-07-08 Thread Andrew Ehrlich
Hi group,

What solutions are people using to do performance testing and tuning of Spark
applications? I have been using a fairly manual technique where I lay out an
Excel sheet of various memory settings and caching parameters and then execute
each one by hand. It’s pretty tedious though, so I’m wondering what others do,
and whether you do performance testing at all. Also, is anyone generating test
data, or just operating on a static set? Is regression testing for performance
a thing?

Andrew
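
One possible way to replace the spreadsheet-and-hand-execution workflow
described above is a small harness that expands the settings into a grid,
times each run, and flags slowdowns against a stored baseline. This is only a
sketch: the function names, the 10% tolerance and the placeholder job are
assumptions, and in practice `run` would launch the real Spark application
(for example through spark-submit with matching --conf options).

```python
import itertools
import statistics
import time


def parameter_grid(options):
    """Expand {"name": [values, ...]} into one dict per combination."""
    names = sorted(options)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(options[n] for n in names))
    ]


def time_run(run, params, repeats=3):
    """Call run(params) `repeats` times; return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run(params)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def is_regression(current, baseline, tolerance=0.10):
    """True if `current` exceeds `baseline` by more than `tolerance` (a fraction)."""
    return current > baseline * (1 + tolerance)


def placeholder_job(params):
    """Stand-in for launching a Spark job (assumption); does trivial local work."""
    sum(range(100_000))


if __name__ == "__main__":
    grid = parameter_grid({
        "spark.executor.memory": ["2g", "4g"],
        "spark.memory.fraction": ["0.6", "0.75"],
    })
    for params in grid:
        seconds = time_run(placeholder_job, params)
        print(params, round(seconds, 4))
```

Storing the median time per configuration from a known-good build and passing
it to `is_regression` on later builds gives a basic performance regression
check that can run unattended.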